Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis (Springer Series in Statistics)

Frank E. Harrell Jr.

Product Description

This highly anticipated second edition features new chapters and sections, 225 new references, and comprehensive R software. In keeping with the previous edition, this book is about the art and science of data analysis and predictive modeling, which entails choosing and using multiple tools. Instead of presenting isolated techniques, this text emphasizes problem-solving strategies that address the many issues arising when developing multivariable models using real data and not standard textbook examples. It includes imputation methods for dealing with missing data effectively, methods for fitting nonlinear relationships and for making the estimation of transformations a formal part of the modeling process, methods for dealing with "too many variables to analyze and not enough observations," and powerful model validation techniques based on the bootstrap. The reader will gain a keen understanding of predictive accuracy and the harm of categorizing continuous predictors or outcomes. This text realistically deals with model uncertainty and its effects on inference, to achieve "safe data mining." It also presents many graphical methods for communicating complex regression models to non-statisticians.

Regression Modeling Strategies presents full-scale case studies of non-trivial datasets instead of over-simplified illustrations of each method. These case studies use freely available R functions that make the multiple imputation, model building, validation and interpretation tasks described in the book relatively easy to do. Most of the methods in this text apply to all regression models, but special emphasis is given to multiple regression using generalized least squares for longitudinal data, the binary logistic model, models for ordinal responses, parametric survival regression models and the Cox semiparametric survival model. A new emphasis is given to the robust analysis of continuous dependent variables using ordinal regression.

As in the first edition, this text is intended for graduate students at the master's or Ph.D. level who have had a general introductory probability and statistics course and who are well versed in ordinary multiple regression and intermediate algebra. The book will also serve as a reference for data analysts and statistical methodologists, as it contains an up-to-date survey and bibliography of modern statistical modeling techniques. Examples used in the text mostly come from biomedical research, but the methods are applicable anywhere predictive models ("analytics") are useful, including economics, epidemiology, sociology, psychology, engineering and marketing.
