Scikit-Learn Cookbook

Trent Hauck


Product Description

Over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation

About This Book

  • Learn how to handle a variety of tasks with Scikit-Learn with interesting recipes that show you how the library really works
  • Use Scikit-Learn to simplify the programming side of data science so you can focus on thinking
  • Discover how to apply algorithms in a variety of situations

Who This Book Is For

If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you.

What You Will Learn

  • Address algorithms of various levels of complexity and learn how to analyze data at the same time
  • Handle common data problems such as feature extraction and missing data
  • Understand how to evaluate a model on its own and compare it against other models
  • Discover just enough math needed to learn how to think about the connections between various algorithms
  • Customize the machine learning algorithm to fit your problem, and learn how to modify it when the situation calls for it
  • Incorporate other packages from the Python ecosystem to munge and visualize your dataset

In Detail

Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. Its consistent API and plethora of features help you tackle most machine learning problems you come across.

The book starts by walking through different methods to prepare your data, whether it is a dataset with missing values or text columns whose categories must be turned into indicator variables. Once the data is ready, you'll learn techniques suited to different objectives, from supervised problems with known outcomes, such as sales by state, to more complicated problems such as clustering similar customers. Finally, you'll learn how to polish your model to ensure that it is both accurate and resilient to new datasets.
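
The workflow described above can be sketched in a few lines. This is a minimal illustration and not code from the book. The toy DataFrame, its column names (`state`, `ad_spend`, `sales`), and the choice of `LinearRegression` and `KMeans` are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: sales by state, with one missing numeric value.
df = pd.DataFrame({
    "state": ["CA", "NY", "TX", "CA", "NY", "TX", "CA", "NY"],
    "ad_spend": [10.0, 12.0, np.nan, 11.0, 13.0, 9.0, 10.5, 12.5],
    "sales": [100.0, 130.0, 90.0, 105.0, 135.0, 88.0, 102.0, 131.0],
})
X = df[["state", "ad_spend"]]
y = df["sales"]

# Step 1: prepare the data. Fill the missing number with the column mean
# and turn the text column into indicator (one-hot) variables.
preprocess = ColumnTransformer(
    [
        ("num", SimpleImputer(strategy="mean"), ["ad_spend"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["state"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Step 2: a supervised model for a known outcome (sales).
model = Pipeline([("prep", preprocess), ("reg", LinearRegression())])

# Step 3: estimate how well the model generalizes with 4-fold cross-validation.
scores = cross_val_score(model, X, y, cv=4)

# Fit on all rows and inspect the prepared feature matrix.
model.fit(X, y)
X_prepared = model.named_steps["prep"].transform(X)

# Unsupervised alternative: group similar rows when no outcome is known.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_prepared)

print(X_prepared.shape, len(scores), sorted(set(labels)))
```

Wrapping the preprocessing and the estimator in one `Pipeline` means the imputer and encoder are refit on each training fold during cross-validation, so no information from the held-out rows leaks into the preparation step.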
