Learn PySpark: Build Python-Based Machine Learning and Deep Learning Models (Paperback)

Singh, Pramod

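  • Publisher: Apress
  • Publication date: 2019-09-07
  • List price: $2,040
  • Member price: $1,938 (5% off)
  • Language: English
  • Pages: 295
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1484249607
  • ISBN-13: 9781484249604
  • Related categories: Python, Spark, Machine Learning, Deep Learning
  • Overseas special-order title (requires separate checkout)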

Product Description

Leverage machine learning and deep learning models to build applications on real-time data using PySpark. This book is for readers who want to learn to use PySpark to perform exploratory data analysis and solve an array of business challenges.
You'll start by reviewing PySpark fundamentals, such as Spark's core architecture, and see how to use PySpark for big data processing, including data ingestion, cleaning, and transformation techniques. This is followed by building workflows for analyzing streaming data using PySpark and a comparison of various streaming platforms.
You'll then see how to schedule different Spark jobs using Airflow with PySpark and examine how to tune machine learning and deep learning models for real-time predictions. The book concludes with a discussion of graph frames and of performing network analysis using graph algorithms in PySpark. All the code presented in the book will be available as Python scripts on GitHub.
What You'll Learn

  • Develop pipelines for streaming data processing using PySpark
  • Build machine learning and deep learning models using PySpark's latest offerings
  • Perform graph analytics using PySpark
  • Create sequence embeddings from text data

Who This Book Is For

Data scientists and machine learning and deep learning engineers who want to learn and use PySpark for real-time analysis of streaming data.

About the Author

Pramod Singh is currently a Manager (Data Science) at Publicis Sapient, working as the data science lead for a project with Mercedes-Benz. He has spent the last nine years working on data projects at SapientRazorfish, Infosys, and Tally, applying techniques ranging from traditional machine learning to advanced deep learning in multiple projects using R, Python, Spark, and TensorFlow. Pramod has also been a regular speaker at major conferences in India and abroad and is currently authoring a couple of books on deep learning and AI techniques. He regularly conducts data science meetups at SapientRazorfish and presents webinars on machine learning and artificial intelligence. He lives in Bangalore with his wife and 2-year-old son. In his spare time, he enjoys coding, reading, and watching football.
