Building Scalable Deep Learning Pipelines on AWS: Develop, Train, and Deploy Deep Learning Models
Testas, Abdelaziz
- Publisher: Apress
- Publication date: 2025-01-06
- Price: $1,890
- VIP price: 5% off, $1,796
- Language: English
- Pages: 390
- Binding: Quality Paper (also called trade paper)
- ISBN: 9798868810169
- ISBN-13: 9798868810169
Related categories:
Amazon Web Services, JVM languages, Deep Learning
Not yet released; unavailable for order
Product Description
This book is your comprehensive guide to creating powerful, end-to-end deep learning workflows on Amazon Web Services (AWS). The book explores how to integrate essential big data tools and technologies, such as PySpark, PyTorch, TensorFlow, Airflow, EC2, and S3, to streamline the development, training, and deployment of deep learning models.
Starting with the importance of scaling advanced machine learning models, this book leverages AWS's robust infrastructure and comprehensive suite of services. It guides you through the setup and configuration needed to maximize the potential of deep learning technologies. You will gain in-depth knowledge of building deep learning pipelines, including data preprocessing, feature engineering, model training, evaluation, and deployment.
The book provides insights into setting up an AWS environment, configuring necessary tools, and using PySpark for distributed data processing. You will also delve into hands-on tutorials for PyTorch and TensorFlow, mastering their roles in building and training neural networks. Additionally, you will learn how Apache Airflow can orchestrate complex workflows and how Amazon S3 and EC2 enhance model deployment at scale.
By the end of this book, you will be equipped to tackle real-world challenges and seize opportunities in the rapidly evolving field of deep learning with AWS. You will gain the insights and skills needed to drive innovation and maintain a competitive edge in today's data-driven landscape.
What You Will Learn
- Maximize AWS services for scalable and high-performance deep learning architectures
- Harness the capacity of PyTorch and TensorFlow for advanced neural network development
- Utilize PySpark for efficient distributed data processing on AWS
- Orchestrate complex workflows with Apache Airflow for seamless data processing, model training, and deployment
Who This Book Is For
Data scientists looking to expand their skill set to include deep learning on AWS; machine learning engineers tasked with designing and deploying machine learning systems who want to incorporate deep learning capabilities into their applications; and AI practitioners working across various industries who seek to leverage deep learning to solve complex problems and gain a competitive advantage.
About the Author
Abdelaziz Testas, PhD, is a seasoned data scientist with over a decade of experience in data analysis and machine learning. He earned his PhD in Economics from the University of Leeds in England and holds a master's degree in the same field from the University of Glasgow in Scotland. Additionally, he has earned several certifications in computer science and data science in the United States.
For over 10 years, Abdelaziz served as a Lead Data Scientist at Nielsen, where he played a pivotal role in enhancing the company's audience measurement capabilities. He was instrumental in planning, initiating, and executing end-to-end data science projects and developing methodologies that advanced Nielsen's digital ad and content rating products. His expertise in media measurement and data science drove the creation of innovative solutions.
Recently, Abdelaziz transitioned to the public sector, joining the State of California's Department of Health Care Access and Information (HCAI). In his new role, he leverages his coding and data science leadership skills to make a meaningful impact, supporting HCAI's mission to ensure quality, equitable, and affordable health care for all Californians.
Abdelaziz is also the author of Distributed Machine Learning with PySpark: Migrating Effortlessly from Pandas and Scikit-Learn (Apress).