Data Wrangling with Python: Simplify your ETL processes with these hands-on data sanitation tips, tricks and best practices

Price: 1568 TWD
Availability: OnlineOnly
Author: Tirthajyoti Sarkar, Shubhadeep Roychowdhury
ISBN: 1789800110

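Publisher: Packt Publishing
Publication date: 2019-02-28
List price: $1,650
VIP price: 5% off, $1,568
Language: English
Pages: 460
Binding: Paperback
ISBN: 1789800110
ISBN-13: 9781789800111
Related categories: Python
Related translation: Python數據整理 (Simplified Chinese edition)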

Overseas special-order book (requires separate checkout)

Product Description

Data is the new oil, but like oil it arrives crude. Before you can do anything meaningful with it (modeling, visualization, or machine learning for predictive analysis), you first need to wrestle and wrangle with the data. This book teaches the essentials of data wrangling using Python.

Key Features

Focuses on the essentials of data wrangling to get you up and running with analysis quickly
Teaches the tricks and know-how for solving data wrangling problems
Covers bonus topics: random data generation and data integrity checks
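
The two bonus topics named above can be illustrated with a small sketch. This is not taken from the book; it is a minimal standard-library example, and the record fields (`id`, `age`, `email`), the helper names `make_records` and `check_integrity`, and the validity rules are all assumptions chosen for illustration.

```python
import random


def make_records(n, seed=0):
    """Generate n synthetic records; the fields are hypothetical."""
    rng = random.Random(seed)  # seeded so the data is reproducible
    return [
        {"id": i, "age": rng.randint(18, 90), "email": f"user{i}@example.com"}
        for i in range(n)
    ]


def check_integrity(records):
    """Return a list of human-readable problems found in the records."""
    problems = []
    seen = set()
    for rec in records:
        if rec["id"] in seen:
            problems.append(f"duplicate id {rec['id']}")
        seen.add(rec["id"])
        if not 0 <= rec["age"] <= 120:  # assumed plausible age range
            problems.append(f"id {rec['id']}: age {rec['age']} out of range")
        if "@" not in rec["email"]:  # deliberately crude email check
            problems.append(f"id {rec['id']}: malformed email")
    return problems


records = make_records(5)
# Inject one deliberately bad record to exercise every check.
records.append({"id": 2, "age": 200, "email": "broken"})
print(check_integrity(records))
```

Generating data with a fixed seed makes it easy to practice cleaning steps repeatedly on the same input, and deliberately corrupting a record is a simple way to confirm that each integrity check fires.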

Book Description

To practice high-quality science with data, you first need to make sure it is properly sourced, cleaned, formatted, and pre-processed. This book teaches the essentials of data wrangling, an invaluable component of the data science pipeline.

What you will learn

Manipulate simple and complex data structures using Python and its built-in functions
Use fundamental and advanced features of pandas DataFrames and numpy.array
Manipulate these structures at run time
Extract and format data from various textual formats: plain text files, SQL, CSV, Excel, JSON, and XML
Perform web scraping using Python libraries such as BeautifulSoup4 and html5lib
Perform advanced string search and manipulation using Python and regular expressions
Handle outliers, apply advanced programming tricks, and perform data imputation using pandas
Apply basic descriptive statistics and plotting techniques in Python for quick examination of data
Practice data wrangling and modeling using random data generation techniques
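
Several of the skills listed above (extracting data from CSV text, regular-expression cleanup, imputation, and outlier handling) can be sketched in a few lines. The example below is an illustration only, not the book's code: it uses the Python standard library instead of pandas so that it runs without extra installs, and the sample data, the helper names, median imputation, and the 1.5 × IQR outlier rule are all assumptions.

```python
import csv
import io
import re
import statistics

# Hypothetical raw input: a small CSV with a missing value and messy prices.
RAW = """item,price
widget,$10.50
gadget,
gizmo,USD 12.00
doohickey,$11.25
thingamajig,$950.00
"""

PRICE_RE = re.compile(r"\d+(?:\.\d+)?")  # first decimal number in the field


def parse_prices(text):
    """Return (item, price-or-None) pairs parsed from CSV text."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        match = PRICE_RE.search(row["price"] or "")
        rows.append((row["item"], float(match.group()) if match else None))
    return rows


def impute_median(rows):
    """Replace missing prices with the median of the observed prices."""
    median = statistics.median(p for _, p in rows if p is not None)
    return [(item, median if p is None else p) for item, p in rows]


def iqr_outliers(rows, k=1.5):
    """Return items whose price lies outside the k * IQR fences."""
    prices = [p for _, p in rows]
    q1, _, q3 = statistics.quantiles(prices, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [item for item, p in rows if p < low or p > high]


cleaned = impute_median(parse_prices(RAW))
print(cleaned)
print(iqr_outliers(cleaned))
```

With pandas, the same steps would typically be expressed with `read_csv`, `fillna`, and `quantile`; the standard-library version is used here only to keep the sketch self-contained.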

Who This Book Is For

Software professionals, web developers, database engineers, and business analysts who want to move toward a career as a full-fledged data scientist or analytics expert, or anyone who wants to use data analytics or machine learning to enrich their current personal or professional projects. Prior experience with Python is not an absolute requirement; however, knowledge of at least one object-oriented programming language (e.g. C/C++/Java/JavaScript) and high-school-level math is highly preferred. It is a bonus if you have a rudimentary idea of relational databases and SQL. Even seasoned Python app/web developers can benefit from this book, as it focuses on data engineering aspects.
