Applied Machine Learning and High-Performance Computing on AWS: Accelerate the development of machine learning applications following architectural be

Khanuja, Mani, Sabir, Farooq, Subramanian, Shreyas

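  • Publisher: Packt Publishing
  • Publication date: 2022-12-30
  • List price: $1,810
  • VIP price: $1,720 (5% off)
  • Language: English
  • Pages: 382
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1803237015
  • ISBN-13: 9781803237015
  • Related categories: Amazon Web Services, Machine Learning
  • Ordered from the supplier once your order is placed (approx. 3-4 weeks)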

Product Description

Build, train, and deploy large machine learning models at scale in various domains such as computational fluid dynamics, genomics, autonomous vehicles, and numerical optimization using Amazon SageMaker

Key Features

- Understand the need for high-performance computing (HPC)
- Build, train, and deploy large ML models with billions of parameters using Amazon SageMaker
- Learn best practices and architectures for implementing ML at scale using HPC

Book Description

Machine learning (ML) and high-performance computing (HPC) on AWS run compute-intensive workloads across industries and emerging applications. Their use cases span verticals such as computational fluid dynamics (CFD), genomics, and autonomous vehicles.

This book provides end-to-end guidance, starting with HPC concepts for storage and networking. It then progresses to working examples of how to process large datasets using SageMaker Studio and EMR. Next, you'll learn how to build, train, and deploy large models using distributed training. Later chapters guide you through deploying models to edge devices using SageMaker and IoT Greengrass, and through optimizing the performance of ML models for low-latency use cases.

By the end of this book, you'll be able to build, train, and deploy your own large-scale ML application, using HPC on AWS, following industry best practices and addressing the key pain points encountered in the application life cycle.

What you will learn

- Explore data management, storage, and fast networking for HPC applications
- Focus on the analysis and visualization of a large volume of data using Spark
- Train vision transformer models using SageMaker distributed training
- Deploy and manage ML models at scale on the cloud and at the edge
- Get to grips with performance optimization of ML models for low-latency workloads
- Apply HPC to industry domains such as CFD, genomics, AV, and optimization

Who this book is for

The book begins with HPC concepts; however, it expects you to have prior machine learning knowledge. It is for ML engineers and data scientists who want to learn advanced topics: training large models on large datasets with distributed training on AWS, deploying models at scale, and optimizing performance for low-latency use cases. Practitioners in fields such as numerical optimization, computational fluid dynamics, autonomous vehicles, and genomics who require HPC to apply ML models at scale will also find the book useful.

Table of Contents

1. High-Performance Computing Fundamentals
2. Data Management and Transfer
3. Compute and Networking
4. Data Storage
5. Data Analysis
6. Distributed Training of Machine Learning Models
7. Deploying Machine Learning Models at Scale
8. Optimizing and Managing Machine Learning Models for Edge Deployment
9. Performance Optimization for Real-Time Inference
10. Data Visualization
11. Computational Fluid Dynamics
12. Genomics
13. Autonomous Vehicles
14. Numerical Optimization
