Data Mining Methods and Models

Daniel T. Larose

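  • Publisher: Wiley
  • Publication date: 2006-01-30
  • Price: $950
  • Language: English
  • Pages: 344
  • Binding: Hardcover
  • ISBN: 0471666564
  • ISBN-13: 9780471666561
  • Related category: Data-mining
  • Ordered from the supplier once the order is placed (about 5-7 days)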

Product Description

Apply Powerful Data Mining Methods and Models to Leverage Your Data for Actionable Results

Data Mining Methods and Models provides:

  • The latest techniques for uncovering hidden nuggets of information
  • The insight into how the data mining algorithms actually work
  • The hands-on experience of performing data mining on large data sets

Data Mining Methods and Models:

  • Applies a "white box" methodology, emphasizing an understanding of the model structures underlying the software
  • Walks the reader through the various algorithms and provides examples of the operation of the algorithms on actual large data sets, including a detailed case study, "Modeling Response to Direct-Mail Marketing"
  • Tests the reader's level of understanding of the concepts and methodologies, with over 110 chapter exercises
  • Demonstrates the Clementine data mining software suite, WEKA open source data mining software, SPSS statistical software, and Minitab statistical software
  • Includes a companion Web site, www.dataminingconsultant.com, where the data sets used in the book may be downloaded, along with a comprehensive set of data mining resources. Faculty adopters of the book have access to an array of helpful resources, including solutions to all exercises, a PowerPoint® presentation of each chapter, sample data mining course projects and accompanying data sets, and multiple-choice chapter quizzes.

With its emphasis on learning by doing, this is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field.

 

Table of Contents

Preface.

1. Dimension Reduction Methods.

Need for Dimension Reduction in Data Mining.

Principal Components Analysis.

Factor Analysis.

User-Defined Composites.

2. Regression Modeling.

Example of Simple Linear Regression.

Least-Squares Estimates.

Coefficient of Determination.

Correlation Coefficient.

The ANOVA Table.

Outliers, High Leverage Points, and Influential Observations.

The Regression Model.

Inference in Regression.

Verifying the Regression Assumptions.

An Example: The Baseball Data Set.

An Example: The California Data Set.

Transformations to Achieve Linearity.

3. Multiple Regression and Model Building.

An Example of Multiple Regression.

The Multiple Regression Model.

Inference in Multiple Regression.

Regression with Categorical Predictors.

Multicollinearity.

Variable Selection Methods.

An Application of Variable Selection Methods.

Mallows’ Cp Statistic.

Variable Selection Criteria.

Using the Principal Components as Predictors in Multiple Regression.

4. Logistic Regression.

A Simple Example of Logistic Regression.

Maximum Likelihood Estimation.

Interpreting Logistic Regression Output.

Inference: Are the Predictors Significant?

Interpreting the Logistic Regression Model.

Interpreting a Logistic Regression Model for a Dichotomous Predictor.

Interpreting a Logistic Regression Model for a Polychotomous Predictor.

Interpreting a Logistic Regression Model for a Continuous Predictor.

The Assumption of Linearity.

The Zero-Cell Problem.

Multiple Logistic Regression.

Introducing Higher-Order Terms to Handle Nonlinearity.

Validating the Logistic Regression Model.

WEKA: Hands-On Analysis Using Logistic Regression.

5. Naïve Bayes and Bayesian Networks.

The Bayesian Approach.

The Maximum a Posteriori (MAP) Classification.

The Posterior Odds Ratio.

Balancing the Data.

Naïve Bayes Classification.

Numeric Predictors for Naïve Bayes Classification.

WEKA: Hands-On Analysis Using Naïve Bayes.

Bayesian Belief Networks.

Using the Bayesian Network to Find Probabilities.

WEKA: Hands-On Analysis Using Bayes Net.

6. Genetic Algorithms.

Introduction to Genetic Algorithms.

The Basic Framework of a Genetic Algorithm.

A Simple Example of Genetic Algorithms at Work.

Modifications and Enhancements: Selection.

Modifications and Enhancements: Crossover.

Genetic Algorithms for Real-Valued Variables.

Using Genetic Algorithms to Train a Neural Network.

WEKA: Hands-On Analysis Using Genetic Algorithms.

7. Case Study: Modeling Response to Direct-Mail Marketing.

The Cross-Industry Standard Process for Data Mining: CRISP-DM.

Business Understanding Phase.

Data Understanding and Data Preparation Phases.

The Modeling Phase and the Evaluation Phase.

Index.
