Ensemble Machine Learning Cookbook: Over 35 practical recipes to explore ensemble machine learning techniques using Python

Dipayan Sarkar, Vijayalakshmi Natarajan

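  • Publisher: Packt Publishing
  • Publication date: 2019-01-30
  • List price: $1,520
  • VIP price: $1,444 (5% off)
  • Language: English
  • Pages: 336
  • Binding: Paperback
  • ISBN: 1789136601
  • ISBN-13: 9781789136609
  • Related categories: Python programming language, Machine Learning
  • Ships immediately (stock: 1)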

Product Description

Implement machine learning algorithms to build ensemble models using Keras, H2O, Scikit-Learn, Pandas and more

Key Features

  • Apply popular machine learning algorithms using a recipe-based approach
  • Implement boosting, bagging, and stacking ensemble methods to improve machine learning models
  • Discover real-world ensemble applications and encounter complex challenges in Kaggle competitions

Book Description

Ensemble modeling is an approach used to improve the performance of machine learning models. It combines two or more similar or dissimilar machine learning algorithms so that the combined model predicts better than any of its members alone. This book will help you implement popular machine learning algorithms that cover different paradigms of ensemble machine learning, such as boosting, bagging, and stacking.
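To make the idea of combining models concrete, here is a minimal sketch of two simple combination rules: max-voting over class labels and weighted averaging over scores. It is not taken from the book. The function names `max_vote` and `weighted_average`, the three sets of model outputs, and the weights are all hypothetical values chosen for illustration, and only the Python standard library is used.

```python
from collections import Counter


def max_vote(predictions):
    """Return the most common label per sample.

    predictions: one list of labels per model, all of equal length.
    """
    return [Counter(sample_votes).most_common(1)[0][0]
            for sample_votes in zip(*predictions)]


def weighted_average(scores, weights):
    """Return the weighted mean score per sample.

    scores: one list of scores per model; weights: one weight per model.
    """
    total = sum(weights)
    return [sum(w * s for w, s in zip(weights, sample_scores)) / total
            for sample_scores in zip(*scores)]


# Hypothetical outputs of three models on four samples (illustration only).
model_labels = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
model_scores = [
    [0.9, 0.2, 0.6, 0.8],
    [0.7, 0.6, 0.4, 0.9],
    [0.4, 0.1, 0.7, 0.6],
]

voted = max_vote(model_labels)
# Assumed weights: the first model counts twice as much as the others.
averaged = weighted_average(model_scores, weights=[2, 1, 1])

print(voted)
print(averaged)
```

With an odd number of models and two classes, the vote cannot tie. With an even number of models, a tie-breaking rule would need to be chosen explicitly.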

The Ensemble Machine Learning Cookbook will start by getting you acquainted with the basics of ensemble techniques and exploratory data analysis. You'll then learn to implement tasks related to statistical and machine learning algorithms to understand the ensemble of multiple heterogeneous algorithms. It will also ensure that you don't miss out on key topics, such as resampling methods. As you progress, you'll get a better understanding of bagging, boosting, stacking, and working with the Random Forest algorithm using real-world examples. The book will highlight how these ensemble methods use multiple models to improve machine learning results, as compared to a single model. In the concluding chapters, you'll delve into advanced ensemble models using neural networks, natural language processing, and more. You'll also be able to implement models for tasks such as fraud detection, text categorization, and sentiment analysis.

By the end of this book, you'll be able to harness ensemble techniques and the working mechanisms of machine learning algorithms to build intelligent models using individual recipes.

What you will learn

  • Understand how to use machine learning algorithms for regression and classification problems
  • Implement ensemble techniques such as averaging, weighted averaging, and max-voting
  • Get to grips with advanced ensemble methods, such as bootstrapping, bagging, and stacking
  • Use Random Forest for tasks such as classification and regression
  • Implement an ensemble of homogeneous and heterogeneous machine learning algorithms
  • Learn and implement various boosting techniques, such as AdaBoost, Gradient Boosting Machine, and XGBoost
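As a rough illustration of the bootstrapping and bagging items in the list above, the sketch below draws bootstrap samples from a toy dataset, fits a very simple threshold classifier on each sample, and combines the fitted classifiers by majority vote. It is an assumption-laden sketch, not one of the book's recipes. The toy data, the one-feature "stump" base learner, and the names `bootstrap_sample`, `fit_stump`, and `bagged_predict` are hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Fit a one-feature rule: predict 1 when x >= threshold.

    Tries every observed x as the threshold and keeps the first one
    with the fewest training errors.
    """
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in rows}):
        errors = sum((x >= threshold) != bool(y) for x, y in rows)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagged_predict(thresholds, x):
    """Majority vote over the stumps fitted on the bootstrap samples."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Toy data (assumed for illustration): the label is 1 when x is at least 5.
data = [(x, int(x >= 5)) for x in range(10)]

# Bagging: one stump per bootstrap sample; 25 is odd, so votes cannot tie.
thresholds = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]
predictions = [bagged_predict(thresholds, x) for x in (0, 9)]

print(thresholds)
print(predictions)
```

Each bootstrap sample sees a slightly different subset of the data, so the fitted thresholds vary from stump to stump. The vote smooths out that variation, which is the intuition behind bagging.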

Who this book is for

This book is designed for data scientists, machine learning developers, and deep learning enthusiasts who want to delve into machine learning algorithms to build powerful ensemble models. Working knowledge of Python programming and basic statistics is a must to help you grasp the concepts in the book.

Table of Contents

  1. Get Closer to Your Data with Exploratory Data Analysis
  2. Getting Started with Ensemble Machine Learning
  3. Resampling Methods
  4. Statistical & Machine Learning Algorithms
  5. Bag the Models with Bagging
  6. When in Doubt, use Random Forest
  7. Boost up Model Performance with Boosting
  8. Blend it with Stacking
  9. Homogeneous Ensemble for Hand-Written Digits Recognition
  10. Heterogeneous Ensemble Classifiers for Credit Card Default Prediction
  11. Heterogeneous Ensemble for Sentiment Analysis using NLP
  12. Heterogeneous Ensemble for Multi-Label Classification for Text Categorization
