Python Data Analysis - Second Edition

Armando Fandango

Product Description

Key Features

  • Find, manipulate, and analyze your data using the Python 3.5 libraries
  • Perform advanced, high-performance linear algebra and mathematical calculations with clean and efficient Python code
  • An easy-to-follow guide with realistic examples of the kind frequently used in real-world data analysis projects

Book Description

Data analysis techniques generate useful insights from small and large volumes of data. Python, with its strong set of libraries, has become a popular platform to conduct various data analysis and predictive modeling tasks.

With this book, you will learn how to process and manipulate data with Python for complex analysis and modeling. We learn data manipulations such as aggregating, concatenating, appending, cleaning, and handling missing values, with NumPy and Pandas. The book covers how to store and retrieve data from various data sources such as SQL and NoSQL databases, CSV files, and HDF5. We learn how to visualize data using visualization libraries, along with advanced topics such as signal processing, time series, textual data analysis, machine learning, and social media analysis.

The book covers a plethora of Python modules, such as matplotlib, statsmodels, scikit-learn, and NLTK. It also covers using Python with external environments such as R, Fortran, C/C++, and Boost libraries.

What you will learn

  • Install open source Python modules such as NumPy, SciPy, Pandas, statsmodels, scikit-learn, Theano, Keras, and TensorFlow on various platforms
  • Prepare and clean your data, and use it for exploratory analysis
  • Manipulate your data with Pandas
  • Retrieve and store your data from RDBMS, NoSQL, and distributed filesystems such as HDFS and
