Machine Learning for Imbalanced Data: Tackle imbalanced datasets using machine learning and deep learning techniques

Kumar Abhishek, Mounir Abdelaziz

  • Publisher: Packt Publishing
  • Publication date: 2023-11-30
  • Price: $1,950
  • VIP price: $1,853 (5% off)
  • Language: English
  • Pages: 344
  • Binding: Trade paperback (quality paper)
  • ISBN: 1801070830
  • ISBN-13: 9781801070836
  • Related categories: Machine Learning, Deep Learning
  • Ships immediately (stock: 1)

Product Description

Take your machine learning expertise to the next level with this essential guide, utilizing libraries like imbalanced-learn, PyTorch, scikit-learn, pandas, and NumPy to maximize model performance and tackle imbalanced data


Key Features:


  • Understand how to use modern machine learning frameworks with detailed explanations, illustrations, and code samples
  • Learn cutting-edge deep learning techniques to overcome data imbalance
  • Explore different methods for dealing with skewed data in ML and DL applications
  • Purchase of the print or Kindle book includes a free eBook in PDF format


Book Description:


As machine learning practitioners, we often encounter imbalanced datasets in which one class has considerably fewer instances than the other. Many machine learning algorithms assume an equilibrium between majority and minority classes, leading to suboptimal performance on imbalanced data. This comprehensive guide helps you address this class imbalance to significantly improve model performance.
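
To make the problem concrete, here is a minimal sketch (not taken from the book) of why plain accuracy can hide poor minority-class performance. The toy labels and the helper names `majority_baseline`, `accuracy`, and `recall` are assumptions made up for illustration.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent label, i.e. what a trivial classifier would always predict."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the predictions recover."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical toy data: 990 majority-class (0) and 10 minority-class (1) labels.
y_true = [0] * 990 + [1] * 10

# A "model" that always predicts the majority class.
y_pred = [majority_baseline(y_true)] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)  # high, despite learning nothing
baseline_recall = recall(y_true, y_pred)      # no minority instance is found
```

With these assumed labels the baseline is right 99% of the time yet finds none of the minority instances, which is why recall, precision, and related metrics are usually reported alongside accuracy when classes are skewed.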


Machine Learning for Imbalanced Data begins by introducing you to the challenges posed by imbalanced datasets and the importance of addressing these issues. It then guides you through techniques that enhance the performance of classical machine learning models when using imbalanced data, including various sampling and cost-sensitive learning methods.
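
As a rough illustration of the sampling idea, the sketch below implements plain random over-sampling with the Python standard library only. It is an assumed, simplified version written for this page, not the book's code; the function name `random_oversample` and the toy data are hypothetical. In practice, imbalanced-learn provides a `RandomOverSampler` class for the same purpose.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_x, out_y = list(features), list(labels)
    for label, count in counts.items():
        # Rows belonging to this class in the original data.
        pool = [x for x, y in zip(features, labels) if y == label]
        # Sample with replacement to fill the gap (empty for the largest class).
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_x.extend(extra)
        out_y.extend([label] * len(extra))
    return out_x, out_y


# Hypothetical toy data: 8 majority rows and 2 minority rows.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)
```

Resampling of this kind is normally applied to the training split only, so that evaluation data keeps the original class distribution.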


As you progress, you'll delve into similar and more advanced techniques for deep learning models, employing PyTorch as the primary framework. Throughout the book, hands-on examples will provide working and reproducible code that'll demonstrate the practical implementation of each technique.


By the end of this book, you'll be adept at identifying and addressing class imbalances and confidently applying various techniques, including sampling, cost-sensitive techniques, and threshold adjustment, while using traditional machine learning or deep learning models.
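
Threshold adjustment, one of the techniques named above, can be sketched in a few lines. This is an illustrative assumption rather than the book's implementation: the scores are made-up model outputs, and `f1_at_threshold` and `best_threshold` are hypothetical helpers that pick the cutoff with the highest F1 score.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 score for the positive class when predicting 1 for scores >= threshold."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, scores) if t == 1 and s < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores):
    """Try every distinct score as a cutoff and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))


# Hypothetical validation labels and predicted probabilities for the positive class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.30, 0.45, 0.35, 0.60]

default_f1 = f1_at_threshold(y_true, scores, 0.5)
chosen = best_threshold(y_true, scores)
chosen_f1 = f1_at_threshold(y_true, scores, chosen)
```

On this toy data the default cutoff of 0.5 misses one of the two positives, while a lower cutoff recovers both at the cost of one false positive. The cutoff should be selected on held-out validation data, not on the test set.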


What You Will Learn:


  • Use imbalanced data in your machine learning models effectively
  • Explore the metrics used when classes are imbalanced
  • Understand how and when to apply various sampling methods such as over-sampling and under-sampling
  • Apply data-based, algorithm-based, and hybrid approaches to deal with class imbalance
  • Combine and choose from various options for data balancing while avoiding common pitfalls
  • Understand the concepts of model calibration and threshold adjustment in the context of dealing with imbalanced datasets


Who this book is for:


This book is for machine learning practitioners who want to effectively address the challenges of imbalanced datasets in their projects. Data scientists, machine learning engineers/scientists, research scientists/engineers, and data engineers will find this book helpful. Though complete beginners are welcome to read this book, some familiarity with core machine learning concepts will help readers maximize the benefits and insights gained from this comprehensive resource.
