Stream Processing with Apache Spark: Best Practices for Scaling and Optimizing Apache Spark
François Garillot, Gerard Maas
Product Description
To build analytics tools that deliver faster insights, you need to know how to process data in real time, which means moving from batch processing to stream processing. Fortunately, Spark, the in-memory data processing framework, includes an extension devoted to fault-tolerant stream processing: Spark Streaming.
If you're familiar with Apache Spark and want to learn how to implement it for streaming jobs, this practical book is a must.
- Understand how Spark Streaming fits in the big picture
- Learn core concepts such as Spark RDDs, Spark Streaming clusters, and the fundamentals of a DStream
- Discover how to create a robust deployment
- Dive into streaming algorithmics
- Learn how to tune, measure, and monitor Spark Streaming
About the Authors
Gerard Maas is a Principal Engineer at Lightbend, where he works on the seamless integration of Structured Streaming and other scalable stream processing technologies into the Lightbend Platform. Previously, he worked at a cloud-native IoT startup, where he led the data processing team in building the streaming pipelines that pushed Spark Streaming to its limits in terms of throughput. Back then, he published the first comprehensive guide to tuning Spark Streaming performance.
Gerard has held leading roles at several startups and large enterprises, building data science governance, cloud-native IoT platforms, telecom platforms, and scalable APIs. He is a regular speaker at technology conferences and contributes to small and large open source projects. Gerard has a degree in Computer Engineering from the Simón Bolívar University, Venezuela. You can find him on Twitter as @maasg.
François Garillot is based in Seattle, where he works on distributed computing at Facebook. He received a Ph.D. from École Polytechnique in 2011, and worked on Spark Streaming's back-pressure while working at Lightbend in 2015. His interests include type systems, leveraging programming languages to make analytics simpler to express, and a passion for Scala, Spark, and roasted arabica. When not at work, he can be found enjoying the mountains of the Pacific Northwest.