Python Data Analysis Cookbook
Provisional Chinese title: Python 數據分析食譜
Ivan Idris
- Publisher: Packt Publishing
- Publication date: 2016-07-22
- Price: $2,200
- VIP price: 5% off, $2,090
- Language: English
- Pages: 462
- Binding: Paperback
- ISBN: 178528228X
- ISBN-13: 9781785282287
Related categories:
Python, Programming Languages, Data Science
Overseas special-order title (requires separate checkout)
Product Description
Key Features
- Analyze Big Data sets, create attractive visualizations, and manipulate and process various data types
- Packed with rich recipes to help you learn and explore amazing algorithms for statistics and machine learning
- Authored by Ivan Idris, an expert in Python programming and proud author of eight highly reviewed books
Book Description
Data analysis is a rapidly evolving field, and Python is a multi-paradigm programming language suitable for object-oriented application development and functional design patterns. Because Python offers a wide range of tools and libraries, it has gradually become a primary language for data science, including data analysis, visualization, and machine learning.
Python Data Analysis Cookbook focuses on reproducibility and creating production-ready systems. You will start with recipes that set the foundation for data analysis with libraries such as matplotlib, NumPy, and pandas. You will learn to create visualizations by choosing color maps and palettes, then dive into statistical data analysis using distribution algorithms and correlations. The recipes then help you find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining.
In this book, you will dive deeper into recipes on spectral analysis, smoothing, and bootstrapping methods. Moving on, you will learn to rank stocks and check market efficiency, then work with metrics and clusters. You will use parallelism, including multiple threads, to speed up your code and improve system performance.
By the end of the book, you will be capable of handling various data analysis techniques in Python and devising solutions for problem scenarios.
What You Will Learn
- Set up reproducible data analysis
- Clean and transform data
- Apply advanced statistical analysis
- Create attractive data visualizations
- Web scrape and work with databases, Hadoop, and Spark
- Analyze images and time series data
- Mine text and analyze social networks
- Use machine learning and evaluate the results
- Take advantage of parallelism and concurrency
About the Author
Ivan Idris was born in Bulgaria to Indonesian parents. He moved to the Netherlands and graduated in experimental physics. His graduation thesis had a strong emphasis on applied computer science. After graduating, he worked for several companies as a software developer, data warehouse developer, and QA analyst.
His professional interests are business intelligence, big data, and cloud computing. He enjoys writing clean, testable code and interesting technical articles. He is the author of NumPy Beginner's Guide, NumPy Cookbook, Learning NumPy, and Python Data Analysis, all by Packt Publishing.
Table of Contents
- Laying the Foundation for Reproducible Data Analysis
- Creating Attractive Data Visualizations
- Statistical Data Analysis and Probability
- Dealing with Data and Numerical Issues
- Web Mining, Databases, and Big Data
- Signal Processing and Timeseries
- Selecting Stocks with Financial Data Analysis
- Text Mining and Social Network Analysis
- Ensemble Learning and Dimensionality Reduction
- Evaluating Classifiers, Regressors, and Clusters
- Analyzing Images
- Parallelism and Performance
- Glossary
- Function Reference