Regression Analysis with Python (Paperback)

Luca Massaron, Alberto Boschetti

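  • Publisher: Packt Publishing
  • Publication date: 2016-02-29
  • List price: $1,860
  • VIP price: $1,767 (5% off)
  • Language: English
  • Pages: 312
  • Binding: Paperback
  • ISBN: 1785286315
  • ISBN-13: 9781785286315
  • Related category: Python programming language
  • Ordered from the supplier once you place your order (about 3~4 weeks)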

Product Description

Key Features

  • Become competent at implementing regression analysis in Python
  • Solve some of the complex data science problems related to predicting outcomes
  • Get to grips with various types of regression for effective data analysis

Book Description

Regression is the process of learning relationships between inputs and continuous outputs from example data, which enables predictions for novel inputs. There are many kinds of regression algorithms, and this book aims to explain which one is right for each kind of problem and how to prepare real-world data for it. With this book you will learn to define a simple regression problem and evaluate its performance. The book will help you understand how to properly parse a dataset, clean it, and create an output matrix optimally built for regression. You will begin with a simple regression algorithm to solve some data science problems and then progress to more complex algorithms. The book will enable you to use regression models to predict outcomes and make critical business decisions. By the end of the book, you will know how to use Python to build fast, better linear models and to apply the results in Python or in any other programming language you prefer.
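To make the idea of "defining a simple regression problem and evaluating its performance" concrete, here is a minimal sketch that is not taken from the book. It fits a straight line by ordinary least squares using only the Python standard library and scores the fit with R². The function names and the data points are invented for illustration.

```python
from statistics import mean


def fit_simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums (unnormalised).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up example data: one input, one continuous output.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_simple_regression(xs, ys)
score = r_squared(xs, ys, slope, intercept)
prediction = slope * 6.0 + intercept  # prediction for a novel input

print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={score:.4f}")
print(f"prediction for x=6: {prediction:.2f}")
```

In practice a library such as scikit-learn or statsmodels would typically do this work; the closed-form version is shown only because it has no dependencies.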

What you will learn

  • Format a dataset for regression and evaluate its performance
  • Apply multiple linear regression to real-world problems
  • Learn to classify training points
  • Create an observation matrix, using different techniques of data analysis and cleaning
  • Apply several techniques to decrease (and eventually fix) any overfitting problem
  • Learn to scale linear models to a big dataset and deal with incremental data
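As an illustration of the last point, scaling a linear model to data that arrives incrementally, the following sketch updates a model one observation at a time with stochastic gradient descent, so the full dataset never has to be held in memory. It is not the book's code: the class `OnlineLinearRegressor`, its `partial_fit` method, the learning rate, and the synthetic stream are all assumptions chosen for this example.

```python
class OnlineLinearRegressor:
    """Single-feature linear model trained one sample at a time (SGD)."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weight = 0.0
        self.bias = 0.0

    def predict(self, x):
        return self.weight * x + self.bias

    def partial_fit(self, x, y):
        """Take one gradient step on the squared error of a single sample."""
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


def data_stream(passes=50, n=100):
    """Hypothetical stream of (x, y) pairs following y = 2x + 1, without noise."""
    for _ in range(passes):
        for i in range(n):
            x = i / n
            yield x, 2.0 * x + 1.0


model = OnlineLinearRegressor()
for x, y in data_stream():
    model.partial_fit(x, y)  # each sample is seen once and then discarded

print(f"weight={model.weight:.4f} bias={model.bias:.4f}")
```

With real, noisy data the learning rate usually needs tuning or decay, and features generally need scaling first; this sketch leaves both out to keep the update rule visible.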

About the Author

Luca Massaron is a data scientist and a marketing research director who specializes in multivariate statistical analysis, machine learning, and customer insight. He has over a decade of experience in solving real-world problems and in generating value for stakeholders by applying reasoning, statistics, data mining, and algorithms. From being a pioneer of Web audience analysis in Italy to achieving the rank of a top ten Kaggler, he has always been passionate about everything regarding data and its analysis, and about demonstrating the potential of data-driven knowledge discovery to both experts and non-experts. Favoring simplicity over unnecessary sophistication, he believes that a lot can be achieved in data science just by doing the essentials.

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a Ph.D. in telecommunication engineering and currently lives and works in London. In his work projects, he faces daily challenges that range from natural language processing (NLP) and machine learning to distributed processing. He is very passionate about his job and always tries to stay up to date on the latest developments in data science technologies by attending meet-ups, conferences, and other events.

Table of Contents

  1. Regression – The Workhorse of Data Science
  2. Approaching Simple Linear Regression
  3. Multiple Regression in Action
  4. Logistic Regression
  5. Data Preparation
  6. Achieving Generalization
  7. Online and Batch Learning
  8. Advanced Regression Methods
  9. Real-world Applications for Regression Models
