Productive and Efficient Data Science with Python: Best Practices Guide to Implementing Aiops

Sarkar, Tirthajyoti

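  • Publisher: Apress
  • Publication date: 2022-07-02
  • List price: $2,300
  • VIP price: $2,185 (5% off)
  • Language: English
  • Pages: 290
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1484281209
  • ISBN-13: 9781484281208
  • Related categories: Python programming language, Data Science
  • Overseas special-order title (must be checked out separately)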

Product Description

Chapter 1: What is Productive and Efficient Data Science?
Chapter Goal: To introduce readers to the concept of doing data science tasks efficiently and more productively, and to illustrate potential pitfalls in their everyday work.
No. of pages: 10
Subtopics:
- Typical data science pipeline
- Short examples of inefficient programming in data science
- Some pitfalls to avoid
- Efficiency and productivity go hand in hand
- Overview of tools and techniques for a productive data science pipeline
- Skills and attitude for productive data science

Chapter 2: Better Programming Principles for Efficient Data Science
Chapter Goal: Help readers grasp the idea of efficient programming techniques and how they can be applied to a typical data science task flow.
No. of pages: 15
Subtopics:
- The concept of time and space complexity, Big-O notation
- Why complexity matters for data science
- Examples of inefficient programming in data science tasks
- What you can do instead
- Measuring code execution timing

Chapter 3: How to Use Python Data Science Packages More Productively
Chapter Goal: Illustrate a handful of tricks and techniques for using the most well-known Python data science packages (NumPy, Pandas, Matplotlib, Seaborn, SciPy) more productively.
No. of pages: 20
Subtopics:
- Why NumPy is faster than regular Python code, and by how much
- Using NumPy efficiently
- Using Pandas productively
- Matplotlib and Seaborn code for productive EDA
- Using SciPy for common data science tasks

Chapter 4: Writing Machine Learning Code More Productively
Chapter Goal: Teach the reader to write efficient and modular machine learning code for a productive data science pipeline, with hands-on examples using Scikit-learn.
No. of pages: 15
Subtopics:
- Why modular code for machine learning and deep learning
- Scikit-learn tools and techniques
- Systematic evaluation of Scikit-learn ML algorithms in an automated fashion
- Decision boundary visualization with a custom function
- Hyperparameter search in Scikit-learn

Chapter 5: Modular and Productive Deep Learning Code
Chapter Goal: Teach the reader how to bring a modular programming style into deep learning code, with hands-on examples using Keras/TensorFlow.
No. of pages: 25
Subtopics:
- Why modular code and object-oriented style for deep learning
- Wrapper functions with Keras for faster deep learning experimentation
- A single function to streamline image classification task flow
- Visualize activation functions of neural networks
- Custom callback functions in Keras and their utilities
- Using the Scikit-learn wrapper for hyperparameter search in Keras

Chapter 6: Build Your Own Machine Learning Estimator/Package
Chapter Goal: Illustrate how to build a new Python machine learning module/package from scratch.
No. of pages: 15
Subtopics:
- Why write your own ML package/module?
- A simple example vs. a data scientist's example
- A good, old Linear Regression estimator - with a twist
- How do you start building?
- Add utility functions
- Do more with an object-oriented approach

Chapter 7: Some Cool Utility Packages
Chapter Goal: Introduce readers to the idea of executing data science tasks efficiently by going beyond the traditional stack and utilizing exciting, new libraries.
No. of pages: 20
Subtopics:
- The great Python

About the Author

Dr. Tirthajyoti Sarkar lives in the San Francisco Bay Area and works as a Data Science and Solutions Engineering Manager at Adapdix Corp., where he architects artificial intelligence and machine learning solutions for edge-computing-based systems powering the Industry 4.0 and smart manufacturing revolution across a wide range of industries. Before that, he spent more than a decade developing best-in-class semiconductor technologies for power electronics.
He has published data science books and regularly contributes highly cited AI/ML-related articles to top platforms such as KDnuggets and Towards Data Science. Tirthajyoti has developed multiple open-source software packages in the field of statistical modeling and data analytics. He holds five US patents and has more than thirty technical publications in international journals and conferences.
He conducts regular workshops, participates in expert panels on various AI/ML topics, and contributes to the broader data science community in numerous ways. Tirthajyoti holds a Ph.D. from the University of Illinois and a B.Tech degree from the Indian Institute of Technology, Kharagpur.
