Apache Hive Essentials: Essential techniques to help you process, and get unique insights from, big data, 2nd Edition

Dayong Du

  • Publisher: Packt Publishing
  • Publication date: 2018-06-29
  • List price: $1,050
  • Sale price: $840 (20% off)
  • Language: English
  • Pages: 210
  • Binding: Paperback
  • ISBN: 1788995090
  • ISBN-13: 9781788995092
  • Related categories: Hadoop, Big Data
  • Ships immediately (stock: 1)

Product Description

This book takes you on a fantastic journey to discover the attributes of big data using Apache Hive.

Key Features

  • Grasp the skills needed to write efficient Hive queries to analyze big data
  • Discover how Hive can coexist and work with other tools within the Hadoop ecosystem
  • Use practical, example-oriented scenarios to cover all the newly released features of Apache Hive 2.3.3

Book Description

In this book, we prepare you for your journey into big data by first introducing the background of the big data domain, along with the process of setting up and getting familiar with your Hive working environment.

Next, the book guides you through discovering and transforming the values of big data with the help of examples. It also hones your skills in using the Hive language efficiently. Toward the end, the book focuses on advanced topics, such as performance, security, and extensions in Hive, which will guide you on exciting adventures on this worthwhile big data journey.

By the end of the book, you will be familiar with Hive and able to work efficiently to find solutions to big data problems.

What you will learn

  • Create and set up the Hive environment
  • Discover how to use Hive's definition language to describe data
  • Discover interesting data by joining and filtering datasets in Hive
  • Transform data by using Hive sorting, ordering, and functions
  • Aggregate and sample data in different ways
  • Boost Hive query performance and enhance data security in Hive
  • Customize Hive to your needs by using user-defined functions and integrate it with other tools

Who This Book Is For

If you are a data analyst, developer, or simply someone who wants to quickly get started with Hive to explore and analyze big data in Hadoop, this is the book for you. Since Hive uses an SQL-like query language, some previous experience with SQL will help you get the most out of this book.

Table of Contents

  1. Overview of Big Data and Hive
  2. Setting Up the Hive Environment
  3. Data Definition and Description
  4. Data Correlation and Scope
  5. Data Manipulation
  6. Data Aggregation and Sampling
  7. Extensibility Considerations
  8. Working with Other Tools
  9. Performance Considerations
  10. Security Considerations
