DuckDB in Action

Mark Needham, Michael Hunger, Michael Simons

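  • Publisher: Manning
  • Publication date: 2024-08-27
  • List price: $2,210
  • VIP price: $2,100 (5% off)
  • Language: English
  • Pages: 312
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1633437256
  • ISBN-13: 9781633437258
  • Overseas special-order title (must be checked out separately)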

Product description

Dive into DuckDB and start processing gigabytes of data with ease--all with no data warehouse.

DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you'll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save hundreds on your cloud bill. From data ingestion to advanced data pipelines, every topic is taught through hands-on examples.

Open up DuckDB in Action and learn how to:

- Read and process data from CSV, JSON, and Parquet sources, both local and remote
- Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables
- Use DuckDB from Python, both with SQL and with its "Relational" API, working with databases as well as data frames
- Prepare, ingest and query large datasets
- Build cloud data pipelines
- Extend DuckDB with custom functionality
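
As a rough illustration of the analytical SQL named in the list above, the sketch below combines an aggregation, a common table expression, and a window function in one query. It is not taken from the book. To stay runnable with nothing but the Python standard library, it uses `sqlite3` as a stand-in engine rather than DuckDB itself; the query is plain SQL and would be expected to work in DuckDB too, but that is an assumption. The `readings` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (city, day, temperature reading).
ROWS = [
    ("Berlin", 1, 10.0),
    ("Berlin", 2, 14.0),
    ("Berlin", 3, 12.0),
    ("Malmo", 1, 8.0),
    ("Malmo", 2, 6.0),
]

QUERY = """
WITH daily AS (                                -- common table expression
    SELECT city, day, AVG(temp) AS avg_temp    -- aggregation
    FROM readings
    GROUP BY city, day
)
SELECT city, day, avg_temp,
       AVG(avg_temp) OVER (                    -- window function: running mean per city
           PARTITION BY city ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_avg
FROM daily
ORDER BY city, day
"""


def run_example():
    """Load the sample rows into an in-memory database and run the query."""
    # sqlite3 is a stand-in here (assumption); window functions need SQLite 3.25+.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE readings (city TEXT, day INTEGER, temp REAL)")
    con.executemany("INSERT INTO readings VALUES (?, ?, ?)", ROWS)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in run_example():
        print(row)
```

Each output row carries the per-day average alongside a running average that restarts for every city, which is the kind of result a window function adds on top of a plain `GROUP BY`.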

Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won't need to read through pages of documentation--you'll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines.

Purchase of the print book includes a free eBook in PDF and ePub formats from Manning Publications.

About the technology

DuckDB makes data analytics fast and fun! You don't need to set up a Spark cluster or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite, and Postgres.

About the book

DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You'll explore DuckDB's handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action.

What's inside

- Prepare, ingest and query large datasets
- Build cloud data pipelines
- Extend DuckDB with custom functionality
- Fast-paced SQL recap: From simple queries to advanced analytics

About the reader

For data pros comfortable with Python and CLI tools.

About the author

Mark Needham is a blogger and video creator at @LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.

Author biography

Mark Needham is a blogger and video creator at @LearnDataWithMark, where his series on DuckDB offers viewers hands-on insights into practical database applications.

Michael Hunger works on the open source Neo4j graph database, where he has filled many roles and leads product innovation and developer product strategy.

Michael Simons is a Java Champion, author, and Staff Software Engineer at Neo4j who has been working professionally as a developer for more than 20 years.
