Python Data Science Essentials - Second Edition

Alberto Boschetti, Luca Massaron

Product Description

Key Features

  • Quickly get familiar with data science using Python 3.5
  • Save time (and effort) with all the essential tools explained
  • Create effective data science projects and avoid common pitfalls with the help of examples and hints dictated by experience

Book Description

Fully expanded and upgraded, the second edition of Python Data Science Essentials takes you through all you need to know to succeed in data science using Python. Get modern insight into the core of Python data, including the latest versions of Jupyter notebooks, NumPy, pandas and scikit-learn. Look beyond the fundamentals with beautiful data visualizations using Seaborn and ggplot, web development with Bottle, and even the new frontiers of deep learning with Theano and TensorFlow.

Dive into building your essential Python 3.5 data science toolbox, using a single-source approach that will allow you to work with Python 2.7 as well. Get to grips fast with data munging and preprocessing, and all the techniques you need to load, analyse, and process your data. Finally, get a complete overview of principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users.

What you will learn

  • Set up your data science toolbox using a Python scientific environment on Windows, Mac, and Linux
  • Get data ready for your data science project
  • Manipulate, fix, and explore data in order to solve data science problems
  • Set up an experimental pipeline to test your data science hypotheses
  • Choose the most effective and scalable learning algorithm for your data science tasks
  • Optimize your machine learning models to get the best performance
  • Explore and cluster graphs, taking advantage of interconnections and links in your data

About the Author

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a PhD in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP), behavioral analysis, and machine learning to distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.

Luca Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight, with over a decade of experience in solving real-world problems and generating value for stakeholders by applying reasoning, statistics, data mining, and algorithms. From being a pioneer of web audience analysis in Italy to achieving the rank of a top ten Kaggler, he has always been very passionate about every aspect of data and its analysis, and also about demonstrating the potential of data-driven knowledge discovery to both experts and non-experts. Favoring simplicity over unnecessary sophistication, Luca believes that a lot can be achieved in data science just by doing the essentials.

Table of Contents

  1. First Steps
  2. Data Munging
  3. The Data Pipeline
  4. Machine Learning
  5. Social Network Analysis
  6. Visualization, Insights, and Results
  7. Strengthen Your Python Foundations
