Pandas Cookbook

Theodore Petrou

Product Description

Key Features

  • Use the power of pandas to solve even the most complex scientific computing problems with ease
  • Leverage the fast, robust data structures in pandas to gain useful insights from your data
  • Practical, easy-to-implement recipes that offer quick solutions to common data problems using pandas

Book Description

Pandas is one of the most powerful, flexible, and efficient scientific computing packages in Python. With this book, you will explore data in pandas through dozens of practice problems with detailed solutions in IPython notebooks.

This book provides clean, clear recipes and solutions that explain how to handle common data manipulation and scientific computing tasks with pandas. You will work with different types of datasets and perform data manipulation and data wrangling effectively. You will explore the power of pandas DataFrames and learn about boolean indexing and multi-indexing. The book also covers statistical and time series computations and shows how to implement them in financial and scientific applications.
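As a rough illustration of the two indexing styles named above, the following sketch filters rows with a boolean mask and then looks up a value through a two-level index. The data, column names, and variable names are invented for this example and do not come from the book.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
df = pd.DataFrame(
    {
        "city": ["Houston", "Houston", "Austin", "Austin"],
        "year": [2016, 2017, 2016, 2017],
        "sales": [120, 150, 90, 110],
    }
)

# Boolean indexing: build a True/False mask, then use it to keep matching rows
mask = (df["sales"] > 100) & (df["year"] == 2017)
high_2017 = df[mask]

# Multi-indexing: promote two columns to a hierarchical row index,
# then select a single value with a (city, year) tuple
indexed = df.set_index(["city", "year"]).sort_index()
austin_2017_sales = indexed.loc[("Austin", 2017), "sales"]

print(high_2017)
print(austin_2017_sales)
```

The mask is an ordinary boolean Series, so conditions can be combined with `&` and `|` as long as each one is wrapped in parentheses.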

By the end of this book, you will have the knowledge you need to master pandas and perform fast, accurate scientific computing.

What you will learn

  • Master the fundamentals of pandas to quickly begin exploring any dataset
  • Isolate any subset of data by properly selecting and querying the data
  • Split data into independent groups before applying aggregations and transformations to each group
  • Restructure data into a tidy form to make data analysis and visualization easier
  • Prepare messy real-world datasets for machine learning
  • Combine and merge data from different sources through pandas' SQL-like operations
  • Utilize pandas' unparalleled time series functionality
  • Create beautiful and insightful visualizations through pandas' direct hooks to matplotlib and seaborn
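Several of the topics listed above can be sketched in a few lines of pandas. The example below is a minimal, assumption-laden illustration rather than a recipe from the book: every table, column name, and variable is hypothetical. It shows grouping for aggregation and transformation, reshaping a wide table into tidy form, a SQL-like left join, and a simple time series resample.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
scores = pd.DataFrame(
    {"team": ["a", "a", "b", "b"], "points": [10, 20, 30, 50]}
)

# Aggregation: split by team, then reduce each group to a single value
team_totals = scores.groupby("team")["points"].sum()

# Transformation: the result keeps the original row count and order,
# so it can be used directly to compute each row's share of its team total
group_sum = scores.groupby("team")["points"].transform("sum")
scores["share"] = scores["points"] / group_sum

# Tidy form: melt a wide table (one column per year) into long format
wide = pd.DataFrame({"team": ["a", "b"], "2016": [1, 2], "2017": [3, 4]})
tidy = wide.melt(id_vars="team", var_name="year", value_name="wins")

# SQL-like join: attach the team totals to every tidy row
merged = tidy.merge(team_totals.reset_index(), on="team", how="left")

# Time series: resample four daily values into two-day sums
ts = pd.Series(
    [1, 2, 3, 4], index=pd.date_range("2017-01-01", periods=4, freq="D")
)
two_day = ts.resample("2D").sum()

print(merged)
print(two_day)
```

The difference between `sum()` and `transform("sum")` is the shape of the result: the first returns one value per group, while the second returns one value per original row.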

About the Author

Theodore Petrou is a data scientist and the founder of Dunder Data, a professional educational company focusing on exploratory data analysis. He is also the head of Houston Data Science, a meetup group with more than 2,000 members that has the primary goal of getting local data enthusiasts together in the same room to practice data science. Before founding Dunder Data, Ted was a data scientist at Schlumberger, a large oil services company, where he spent the vast majority of his time exploring data.

His projects there included using targeted sentiment analysis to discover the root cause of part failure from engineer text, developing customized client/server dashboarding applications, and building real-time web services to avoid the mispricing of sales items. Ted received his master's degree in statistics from Rice University and used his analytical skills to play poker professionally and teach math before becoming a data scientist. Ted is a strong supporter of learning through practice and can often be found answering questions about pandas on Stack Overflow.

Table of Contents

  1. Pandas Foundations
  2. Essential DataFrame Operations
  3. Beginning Data Analysis
  4. Selecting Subsets of Data
  5. Boolean Indexing
  6. Index Alignment
  7. Grouping for Aggregation, Filtration and Transformation
  8. Restructuring Data into Tidy Form
  9. Joining multiple pandas objects
  10. Time Series
  11. Visualization
