Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning, 2/e (Paperback)

Lakshmanan, Valliappa

  • Publisher: O'Reilly
  • Publication date: 2022-05-03
  • List price: $2,450
  • Sale price: $2,205 (10% off)
  • Language: English
  • Pages: 446
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1098118952
  • ISBN-13: 9781098118952
  • Related categories: Google Cloud, Machine Learning, Data Science
  • Ships immediately (stock < 3)

Product description

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP.

Through the course of this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.

You'll learn how to:

  • Employ best practices in building highly scalable data and ML pipelines on Google Cloud
  • Automate and schedule data ingest using Cloud Run
  • Create and populate a dashboard in Data Studio
  • Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
  • Conduct interactive data exploration with BigQuery
  • Create a Bayesian model with Spark on Cloud Dataproc
  • Forecast time series and do anomaly detection with BigQuery ML
  • Aggregate within time windows with Dataflow
  • Train explainable machine learning models with Vertex AI
  • Operationalize ML with Vertex AI Pipelines

About the author

Valliappa (Lak) Lakshmanan is the director of analytics and AI solutions at Google Cloud, where he leads a team building cross-industry solutions to business problems. His mission is to democratize machine learning so that it can be done by anyone anywhere. Lak is the author or coauthor of Practical Machine Learning for Computer Vision, Machine Learning Design Patterns, Data Governance: The Definitive Guide, Google BigQuery: The Definitive Guide, and Data Science on the Google Cloud Platform.
