Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications (Addison-Wesley Data & Analytics Series)

Andrew Kelleher, Adam Kelleher

Product Description

Foundational Hands-On Skills for Succeeding with Real Data Science Projects

This pragmatic book introduces both machine learning and data science, bridging gaps between data scientist and engineer, and helping you bring these techniques into production. It helps ensure that your efforts actually solve your problem, and offers unique coverage of real-world optimization in production settings.

–From the Foreword by Paul Dix, series editor

Machine Learning in Production is a crash course in data science and machine learning for people who need to solve real-world problems in production environments. Written for technically competent “accidental data scientists” with more curiosity and ambition than formal training, this complete and rigorous introduction stresses practice, not theory.

Building on agile principles, Andrew and Adam Kelleher show how to quickly deliver significant value in production, resisting overhyped tools and unnecessary complexity. Drawing on their extensive experience, they help you ask useful questions and then execute production projects from start to finish.

The authors show just how much information you can glean with straightforward queries, aggregations, and visualizations, and they teach indispensable error analysis methods to avoid costly mistakes. They turn to workhorse machine learning techniques such as linear regression, classification, clustering, and Bayesian inference, helping you choose the right algorithm for each production problem. Their concluding section on hardware, infrastructure, and distributed systems offers unique and invaluable guidance on optimization in production environments.

Andrew and Adam always focus on what matters in production: solving the problems that offer the highest return on investment, using the simplest, lowest-risk approaches that work.

  • Leverage agile principles to maximize development efficiency in production projects
  • Learn from practical Python code examples and visualizations that bring essential algorithmic concepts to life
  • Start with simple heuristics and improve them as your data pipeline matures
  • Avoid bad conclusions by implementing foundational error analysis techniques
  • Communicate your results with basic data visualization techniques
  • Master basic machine learning techniques, starting with linear regression and random forests
  • Perform classification and clustering on both vector and graph data
  • Learn the basics of graphical models and Bayesian inference
  • Understand correlation and causation in machine learning models
  • Explore overfitting, model capacity, and other advanced machine learning techniques
  • Make informed architectural decisions about storage, data transfer, computation, and communication

Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.
