Data Labeling in Machine Learning with Python: Explore modern ways to prepare labeled data for training and fine-tuning ML and generative AI models
Suda, Vijaya Kumar
- Publisher: Packt Publishing
- Publication date: 2024-01-31
- List price: $1,980
- VIP price: $1,881 (5% off)
- Language: English
- Pages: 398
- Binding: Quality Paper - also called trade paper
- ISBN: 1804610542
- ISBN-13: 9781804610541
- Related categories: Python, Programming Languages, Artificial Intelligence, Machine Learning
Ships immediately (stock: 1)
Product Description
Take your data preparation, machine learning, and GenAI skills to the next level by learning a range of Python algorithms and tools for data labeling
Key Features:
- Generate labels for regression in scenarios with limited training data
- Apply generative AI and large language models (LLMs) to explore and label text data
- Leverage Python libraries for image, video, and audio data analysis and data labeling
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description:
Data labeling is the invisible hand that guides the power of artificial intelligence and machine learning. In today's data-driven world, mastering data labeling is not just an advantage; it is a necessity. Data Labeling in Machine Learning with Python empowers you to unearth value from raw data, create intelligent systems, and influence the course of technological evolution.
With this book, you'll learn how to use summary statistics, weak supervision, programmatic rules, and heuristics to assign labels to unlabeled training data. As you progress, you'll enhance your datasets with semi-supervised learning and data augmentation. You'll then move on to annotating image, video, and audio data with Python libraries such as seaborn, matplotlib, cv2, librosa, openai, and langchain. With hands-on guidance and practical examples, you'll gain proficiency in annotating diverse data types effectively.
By the end of this book, you'll have the practical expertise to programmatically label diverse data types and enhance datasets, unlocking the full potential of your data.
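To give a feel for what labeling with programmatic rules and heuristics can look like, here is a minimal sketch in plain Python. It is not taken from the book: the keyword rules, the toy sentiment task, and the names `weak_label` and `RULES` are all assumptions made for illustration. Each rule votes for a label or abstains, and the majority vote becomes the weak label.

```python
from collections import Counter

ABSTAIN = None  # a rule returns this when it has no opinion

# Hypothetical keyword rules for a toy sentiment task (illustrative only).
def rule_positive_words(text):
    words = ("great", "love", "excellent")
    return "positive" if any(w in text.lower() for w in words) else ABSTAIN

def rule_negative_words(text):
    words = ("terrible", "hate", "refund")
    return "negative" if any(w in text.lower() for w in words) else ABSTAIN

def rule_exclamation(text):
    # Weak heuristic: repeated exclamation marks hint at an enthusiastic review.
    return "positive" if text.count("!") >= 2 else ABSTAIN

RULES = [rule_positive_words, rule_negative_words, rule_exclamation]

def weak_label(text, rules=RULES):
    """Majority vote over the rules; None if every rule abstains or the vote ties."""
    votes = [v for v in (rule(text) for rule in rules) if v is not ABSTAIN]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave the example unlabeled rather than guess
    return ranked[0][0]

reviews = [
    "I love this product, it is great!!",
    "Terrible quality, I want a refund.",
    "It arrived on Tuesday.",
]
labels = [weak_label(r) for r in reviews]
```

Examples that stay unlabeled (`None`) can be sent to a human annotator or dropped. Dedicated weak-supervision tools typically replace the simple majority vote with a learned model of each rule's accuracy.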
What You Will Learn:
- Excel in exploratory data analysis (EDA) for tabular, text, audio, video, and image data
- Understand how to use Python libraries to apply rules to label raw data
- Discover data augmentation techniques for adding classification labels
- Leverage K-means clustering to group and label unlabeled data
- Explore how hybrid supervised learning is applied to add labels for classification
- Master text data classification with generative AI
- Detect objects and classify images with OpenCV and YOLO
- Uncover a range of techniques and resources for data annotation
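As a rough illustration of the K-means item in the list above, the following sketch clusters unlabeled points and then attaches a human-chosen name to each cluster. It is an assumption-laden toy, not the book's code: the data points, the `kmeans` helper, and the `cluster_names` mapping are hypothetical, and a real project would more likely use a library implementation.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm. Centroids start at the first k points
    so the result is reproducible (a simplification for this sketch)."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids

# Hypothetical unlabeled 2-D measurements forming two obvious groups.
points = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
assignments, centroids = kmeans(points, k=2)

# Cluster ids are anonymous; a person inspects each cluster and names it.
cluster_names = {0: "small", 1: "large"}
labels = [cluster_names[a] for a in assignments]
```

K-means only groups similar points. The mapping from cluster id to a meaningful class name still has to come from someone inspecting a few members of each cluster.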
Who this book is for:
This book is for machine learning engineers, data scientists, and data engineers who want to learn data labeling methods and algorithms for model training. Data enthusiasts and Python developers will be able to use this book to learn data exploration and annotation using Python libraries. Basic Python knowledge is beneficial but not necessary to get started.