Practical Enterprise Data Lake Insights: Handle Data-Driven Challenges in an Enterprise Big Data Lake

Saurabh Gupta, Venkata Giri

  • Publisher: Apress
  • Publication date: 2018-06-28
  • List price: $1,650
  • Sale price: $990 (60% of list price)
  • Language: English
  • Pages: 327
  • Binding: Paperback
  • ISBN: 1484235215
  • ISBN-13: 9781484235218
  • Related categories: Big Data
  • Ships immediately (stock < 3)

Product Description

Use this practical guide to successfully handle the challenges encountered when designing an enterprise data lake and learn industry best practices to resolve issues.
 
When designing an enterprise data lake, you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data. Starting with sourcing data into the Hadoop ecosystem, you will go through stages that raise tough questions about data processing, data querying, and security. Concepts such as change data capture and data streaming are also covered. The book takes an end-to-end solution approach to the data lake environment, covering data security, high availability, data processing, data streaming, and more.
 
Each chapter includes an application of a concept, code snippets, and use case demonstrations to give you a practical approach. For each topic, you will learn the concept, its scope, its application, and a starting point.
 
What You'll Learn
  • Get to know data lake architecture and design principles
  • Implement data capture and streaming strategies
  • Implement data processing strategies in Hadoop
  • Understand the data lake security framework and availability model

Who This Book Is For
 
Big data architects and solution architects
