Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning
Provisional Chinese title: 數據科學揭密:特徵工程、數據視覺化、管道開發與超參數調整

Nokeri, Tshepo Chris

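  • Publisher: Apress
  • Publication date: 2021-03-07
  • List price: $2,020
  • VIP price: $1,919 (95% of list price)
  • Language: English
  • Pages: 252
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1484268695
  • ISBN-13: 9781484268698
  • Related categories: Data Visualization, Machine Learning
  • Overseas special-order book (requires separate checkout)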

Product description

Get insight into data science techniques such as data engineering and visualization, statistical modeling, machine learning, and deep learning. This book teaches you how to select variables, optimize hyperparameters, develop pipelines, and train, test, and validate machine learning and deep learning models. Each chapter includes a set of examples that help you understand the concepts, assumptions, and procedures behind each model.

The book covers parametric methods, or linear models, that combat under- or over-fitting using techniques such as Lasso and Ridge. It includes complex regression analysis with time series smoothing, decomposition, and forecasting. It takes a fresh look at binary classification with logistic regression analysis and at classifiers such as decision trees, support vector machines, and naive Bayes. It covers the most popular non-parametric method for time-to-event data (the Kaplan-Meier estimator). It also covers ways of solving classification problems using artificial neural networks such as restricted Boltzmann machines, multi-layer perceptrons, and deep belief networks. The book discusses unsupervised clustering techniques such as the K-means method and the agglomerative and DBSCAN approaches, as well as dimension reduction techniques such as Feature Importance, Principal Component Analysis, and Linear Discriminant Analysis. Finally, it introduces driverless artificial intelligence using H2O.

After reading this book, you will be able to develop, test, validate, and optimize statistical, machine learning, and deep learning models, and engineer, visualize, and interpret data sets.

What You Will Learn

  • Design, develop, train, and validate machine learning and deep learning models
  • Find optimal hyperparameters for superior model performance
  • Improve model performance using techniques such as dimension reduction and regularization
  • Extract meaningful insights for decision making using data visualization
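
To give a feel for what "finding optimal hyperparameters" and "regularization" mean in practice, here is a minimal sketch that is not taken from the book. It assumes a one-feature Ridge-style model with no intercept, synthetic data, and a small hand-picked grid of penalty values; the names `fit_ridge`, `cross_val_mse`, and `grid_search` are hypothetical helpers written for this illustration only.

```python
import random


def fit_ridge(xs, ys, alpha):
    """Closed-form ridge slope for a one-feature model without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)  # a larger alpha shrinks the slope toward 0


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_val_mse(xs, ys, alpha, k=5):
    """Average validation error over k interleaved folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        w = fit_ridge(train_x, train_y, alpha)
        scores.append(mse(test_x, test_y, w))
    return sum(scores) / k


def grid_search(xs, ys, alphas, k=5):
    """Return the candidate alpha with the lowest cross-validated error."""
    return min(alphas, key=lambda a: cross_val_mse(xs, ys, a, k))


# Synthetic data (an assumption for this sketch): y = 2x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

alphas = [0.0, 0.1, 1.0, 10.0, 100.0]  # hypothetical search grid
best_alpha = grid_search(xs, ys, alphas)
best_w = fit_ridge(xs, ys, best_alpha)
print("best alpha:", best_alpha, "slope:", best_w)
```

In real projects this search is usually delegated to a library rather than written by hand; the sketch only shows the idea of scoring each candidate penalty on held-out data and keeping the best one.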

Who This Book Is For
Beginner- and intermediate-level data scientists and machine learning engineers

About the author

Tshepo Chris Nokeri harnesses advanced analytics and artificial intelligence to foster innovation and optimize business performance. He has delivered complex solutions to companies in the mining, petroleum, and manufacturing industries. He completed a bachelor's degree in information management and graduated with an honors degree in business science at the University of the Witwatersrand on a TATA Prestigious Scholarship and a Wits Postgraduate Merit Award. He was also awarded the Oxford University Press Prize.
