Fast Data Processing Systems with SMACK Stack

Raul Estrada

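  • Publisher: Packt Publishing
  • Publication date: 2016-12-22
  • List price: $2,170
  • VIP price: $2,062 (5% off)
  • Language: English
  • Pages: 348
  • Binding: Paperback
  • ISBN: 1786467208
  • ISBN-13: 9781786467201
  • Overseas purchase-on-request title (must be checked out separately)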

Product Description

Key Features

  • This highly practical guide shows you how to use the best of the big data technologies to solve your response-critical problems
  • Learn the art of building a cheap yet effective big data architecture without using complex Greek-letter architectures
  • Use this easy-to-follow guide to build fast data processing systems for your organization

Book Description

SMACK is an open source full stack for big data architecture. It combines Spark, Mesos, Akka, Cassandra, and Kafka. Developers have begun to use this stack to tackle critical real-time analytics for big data. This highly practical guide will teach you how to integrate these technologies to create a highly efficient data analysis system for fast data processing.

We’ll start off with an introduction to SMACK and show you when to use it. First, you’ll get to grips with functional thinking and problem solving using Scala. Next, you’ll come to understand the Akka architecture. Then you’ll learn how to improve the data structure architecture and optimize resources using Apache Spark.

Moving forward, you’ll learn how to achieve linear scalability in databases with Apache Cassandra. You’ll come to grips with high-throughput distributed messaging systems using Apache Kafka. We’ll show you how to build a cheap but effective cluster infrastructure with Apache Mesos. Finally, you will take a deep dive into the different aspects of SMACK using a few case studies.

By the end of the book, you will be able to integrate all the components of the SMACK stack and use them together to achieve highly effective and fast data processing.

What you will learn

  • Design and implement a fast data pipeline architecture
  • Think about and solve programming challenges in a functional way with Scala
  • Learn to use Akka, the actor model implementation for the JVM
  • Perform in-memory processing and data analysis with Spark to meet modern business demands
  • Build a powerful and effective cluster infrastructure with Mesos and Docker
  • Manage and consume unstructured and NoSQL data sources with Cassandra
  • Consume and produce messages at massive scale with Kafka

About the Author

Raúl Estrada has been a programmer since 1996 and a Java developer since 2001. He loves functional languages such as Scala, Elixir, Clojure, and Haskell. He also loves all topics related to computer science. With more than 12 years of experience in high availability and enterprise software, he has designed and implemented architectures since 2003.

He specializes in systems integration and has participated in projects mainly related to the financial sector. He has been an enterprise architect for BEA Systems and Oracle Inc., but he also enjoys mobile programming and game development. He considers himself a programmer before an architect, engineer, or developer.

He is also a Crossfitter in the San Francisco Bay Area, now focused on open source projects related to data pipelining such as Apache Flink, Apache Kafka, and Apache Beam. Raúl is a supporter of free software, and enjoys experimenting with new technologies, frameworks, languages, and methods.

Table of Contents

  1. An Introduction to SMACK
  2. The Model - Scala and Akka
  3. The Engine - Apache Spark
  4. The Storage - Apache Cassandra
  5. The Broker - Apache Kafka
  6. The Manager - Apache Mesos
  7. Study Case 1 - Spark and Cassandra
  8. Study Case 2 - Connectors
  9. Study Case 3 - Mesos and Docker
