Data Engineering with Google Cloud Platform - Second Edition: A guide to leveling up as a data engineer by building a scalable data platform with Goog

Wijaya, Adi

  • Publisher: Packt Publishing
  • Publication date: 2024-04-30
  • List price: $1,640
  • VIP price: $1,558 (5% off)
  • Language: English
  • Pages: 476
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1835080111
  • ISBN-13: 9781835080115
  • Related categories: Google Cloud, JVM languages
  • Ships immediately (stock: 1)

Product description

Become a successful data engineer by building and deploying your own data pipelines on Google Cloud, including making key architectural decisions

Key Features
  • Get up to speed with data governance on Google Cloud
  • Learn how to use various Google Cloud products like Dataform, DLP, Dataplex, Dataproc Serverless, and Datastream
  • Boost your confidence by getting Google Cloud data engineering certification guidance from real exam experiences
  • Purchase of the print or Kindle book includes a free PDF eBook
Book Description

The second edition of Data Engineering with Google Cloud builds on the success of the first edition, offering greater clarity and depth to data professionals navigating the data engineering landscape. Beyond its foundational lessons, this new edition covers data governance within Google Cloud, providing you with insights into managing and optimizing data resources effectively. The book also guides you through recent developments in the Google Cloud ecosystem, from Cloud Composer 2 to the evolution of Airflow 2.5. Additionally, you'll explore how to work with tools like Dataform, DLP, Dataplex, Dataproc Serverless, and Datastream to perform data governance on datasets. By the end of this book, you'll be equipped to navigate data engineering on Google Cloud, from foundational principles to current practices.

What you will learn
  • Load data into BigQuery and materialize its output
  • Focus on data pipeline orchestration using Cloud Composer
  • Formulate Airflow jobs to orchestrate and automate a data warehouse
  • Establish a Hadoop data lake, generate ephemeral clusters, and execute jobs on the Dataproc cluster
  • Harness Pub/Sub for messaging and ingestion for event-driven systems
  • Apply Dataflow to conduct ETL on streaming data
  • Implement data governance services on Google Cloud
Who this book is for

Data analysts, IT practitioners, software engineers, and any data enthusiast looking to build a successful data engineering career will find this book invaluable. Experienced data professionals who want to start using Google Cloud to build data platforms will also get clear insights on how to navigate the path. Whether you're a beginner who wants to explore the fundamentals or a seasoned professional seeking to learn the latest data engineering concepts, this book is for you.

Table of Contents
  1. Fundamentals of Data Engineering with GCP
  2. Big Data Capabilities on GCP
  3. Building a Data Warehouse in BigQuery
  4. Build Orchestration for Batch Data Loading Using Cloud Composer
  5. Building a Data Lake Using Dataproc
  6. Process Streaming Data with Datastream, Pub/Sub, and Dataflow
  7. Visualizing Data for Making Data-Driven Decisions with Looker Studio
  8. Build Machine Learning Solutions on GCP
  9. User and Project Management on GCP
  10. Data Governance in GCP
  11. Cost Strategy in GCP
  12. CI/CD on Google Cloud Platform for Data Engineers
  13. Boost Your Confidence as a Data Engineer
