Mastering Apache Spark 2.x - Second Edition

Romeo Kienzler

  • Publisher: Packt Publishing
  • Publication date: 2017-07-20
  • List price: $1,650
  • Sale price: $1,320 (20% off)
  • Language: English
  • Pages: 354
  • Binding: Paperback
  • ISBN: 1786462745
  • ISBN-13: 9781786462749
  • Related categories: Spark
  • Ships immediately (stock: 1)

Product Description

Advanced analytics on your Big Data with the latest Apache Spark 2.x

About This Book

  • An advanced guide that combines instructions and practical examples to extend the most up-to-date Spark functionalities.
  • Extend your data processing capabilities to process huge chunks of data in minimal time using advanced concepts in Spark.
  • Master the art of real-time processing with the help of Apache Spark 2.x.

Who This Book Is For

If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected.

What You Will Learn

  • Examine advanced Machine Learning and Deep Learning with MLlib, SparkML, SystemML, H2O and DeepLearning4J
  • Study highly optimised unified batch and real-time data processing using SparkSQL and Structured Streaming
  • Evaluate large-scale Graph Processing and Analysis using GraphX and GraphFrames
  • Apply Apache Spark in Elastic deployments using Jupyter and Zeppelin Notebooks, Docker, Kubernetes and the IBM Cloud
  • Understand the internal details of the cost-based optimizers used in Catalyst, SystemML and GraphFrames
  • Learn how specific parameter settings affect overall performance of an Apache Spark cluster
  • Leverage Scala, R and Python for your data science projects

In Detail

Apache Spark is an in-memory cluster-based parallel processing system that provides a wide range of functionalities such as graph processing, machine learning, stream processing, and
