scikit-learn : Machine Learning Simplified: Implement scikit-learn into every step of the data science pipeline

Raul Garreta, Guillermo Moncecchi, Trent Hauck, Gavin Hackeling

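  • Publisher: Packt Publishing
  • Publication date: 2017-11-17
  • List price: $3,630
  • VIP price: $3,449 (5% off)
  • Language: English
  • Pages: 530
  • Binding: Paperback
  • ISBN: 1788833473
  • ISBN-13: 9781788833479
  • Related categories: Machine Learning, Data Science
  • Ordered from the supplier once your order is placed (approx. 3-4 weeks)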

Product Description

Implement scikit-learn into every step of the data science pipeline

About This Book

  • Use Python and scikit-learn to create intelligent applications
  • Discover how to apply algorithms in a variety of situations to tackle common and not-so-common challenges in the machine learning domain
  • A practical, example-based guide to help you gain expertise in implementing and evaluating machine learning systems using scikit-learn

Who This Book Is For

If you are a programmer and want to explore machine learning and data-based methods to build intelligent applications and enhance your programming skills, this is the course for you. No previous experience with machine-learning algorithms is required.

What You Will Learn

  • Review fundamental concepts including supervised and unsupervised experiences, common tasks, and performance metrics
  • Classify objects (from documents to human faces and flower species) based on some of their features, using a variety of methods from Support Vector Machines to Naive Bayes
  • Use Decision Trees to explain the main causes of certain phenomena such as passenger survival on the Titanic
  • Evaluate the performance of machine learning systems in common tasks
  • Master algorithms of various levels of complexity and learn how to analyze data at the same time
  • Learn just enough math to think about the connections between various algorithms
  • Customize machine learning algorithms to fit your problem, and learn how to modify them when the situation calls for it
  • Incorporate other packages from the Python ecosystem to munge and visualize your dataset
  • Improve the way you build your models using parallelization techniques
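
As a rough illustration of the classification and evaluation topics listed above, the following sketch trains a Support Vector Machine and a Naive Bayes classifier on the iris flower dataset bundled with scikit-learn and compares their accuracy on held-out data. It is not taken from the book; the split ratio, random seed, and kernel are arbitrary assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Flower measurements (features) and species labels (targets).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (assumed split and seed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two of the methods named in the list above, with illustrative settings.
models = {"svm": SVC(kernel="rbf"), "naive_bayes": GaussianNB()}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Accuracy is one common performance metric for classification.
    scores[name] = accuracy_score(y_test, model.predict(X_test))

print(scores)
```

Accuracy alone can mislead on imbalanced data, so precision and recall may be worth checking as well on real problems.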

In Detail

Machine learning, the art of creating applications that learn from experience and data, has been around for many years. Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility; moreover, within the Python data space, scikit-learn is the unequivocal choice for machine learning. The course combines an introduction to some of the main concepts and methods in machine learning with practical, hands-on examples of real-world problems. The course starts by walking through different methods to prepare your data, whether it is a dataset with missing values or text columns whose categories must be turned into indicator variables. After the data is ready, you'll learn different techniques aligned with different objectives, from datasets with known outcomes, such as sales by state, to more complicated problems, such as clustering similar customers. Finally, you'll learn how to polish your algorithm to ensure that it's both accurate and resilient to new datasets. You will learn to incorporate machine learning in your applications. Examples ranging from handwritten digit recognition to document classification are solved step by step using scikit-learn and Python. By the end of this course, you will have learned how to build applications that learn from experience by applying the main concepts and techniques of machine learning.
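
The data preparation step described above might look something like the following minimal sketch, which fills in a missing numeric value and turns a text column into indicator variables. The toy `sales` and `state` arrays are hypothetical and are not drawn from the book; mean imputation is just one of several possible strategies.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: a numeric column with one missing value
# and a text column holding a category per row.
sales = np.array([[100.0], [np.nan], [300.0]])
state = np.array([["NY"], ["CA"], ["NY"]])

# Replace the missing value with the mean of the observed values.
imputed = SimpleImputer(strategy="mean").fit_transform(sales)

# Turn each category into its own 0/1 indicator column.
encoder = OneHotEncoder(handle_unknown="ignore")
indicators = encoder.fit_transform(state).toarray()

# Combine the cleaned numeric column with the indicator columns.
features = np.hstack([imputed, indicators])
print(encoder.categories_[0])
print(features)
```

In a larger project these steps would usually be wrapped in a pipeline so that the same transformations are applied to new data.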

Style and Approach

Implement scikit-learn using engaging examples and fun exercises, and with a gentle and friendly but comprehensive "learn-by-doing" approach. This is a practical course, which analyzes compelling data about life, health, and death with the help of tutorials. It offers you a useful way of interpreting the data that's specific to this course, but that can also be applied to any other data. This course is designed to be both a guide and a reference for moving beyond the basics of scikit-learn.
