Python Natural Language Processing Cookbook: Over 50 recipes to understand, analyze, and generate text for implementing language processing tasks

Antic, Zhenya

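  • Publisher: Packt Publishing
  • Publication date: 2021-03-19
  • List price: $1,450
  • VIP price: $1,378 (5% off)
  • Language: English
  • Pages: 284
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1838987312
  • ISBN-13: 9781838987312
  • Related category: Python programming language
  • Related translation: Python 自然語言處理實戰 (Simplified Chinese edition)
  • Ships immediately (stock = 1)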

Product Description

Get to grips with solving real-world NLP problems, such as dependency parsing, information extraction, topic modeling, and text data visualization


Key Features:

  • Analyze varying complexities of text using popular Python packages such as NLTK, spaCy, sklearn, and gensim
  • Implement common and not-so-common linguistic processing tasks using Python libraries
  • Overcome the common challenges faced while implementing NLP pipelines


Book Description:

Python is the most widely used language for natural language processing (NLP) thanks to its extensive tools and libraries for analyzing text and extracting computer-usable data. This book will take you through a range of techniques for text processing, from basics such as parsing the parts of speech to complex topics such as topic modeling, text classification, and visualization.


Starting with an overview of NLP, the book presents recipes for dividing text into sentences, stemming and lemmatization, removing stopwords, and part-of-speech tagging to help you prepare your data. You'll then learn ways of extracting and representing grammatical information, such as dependency parsing and anaphora resolution, discover different ways of representing semantics using bag-of-words, TF-IDF, word embeddings, and BERT, and develop skills for text classification using keywords, SVMs, LSTMs, and other techniques. As you advance, you'll also see how to extract information from text, implement unsupervised and supervised techniques for topic modeling, and perform topic modeling of short texts, such as tweets. Additionally, the book shows you how to develop chatbots using NLTK and Rasa and visualize text data.


By the end of this NLP book, you'll have developed the skills to use a powerful set of tools for text processing.


What You Will Learn:

  • Become well-versed with basic and advanced NLP techniques in Python
  • Represent grammatical information in text using spaCy, and semantic information using bag-of-words, TF-IDF, and word embeddings
  • Perform text classification using different methods, including SVMs and LSTMs
  • Explore different techniques for topic modeling such as K-means, LDA, NMF, and BERT
  • Work with visualization techniques such as NER and word clouds for different NLP tools
  • Build a basic chatbot using NLTK and Rasa
  • Extract information from text using regular expression techniques and statistical and deep learning tools


Who this book is for:

This book is for data scientists and professionals who want to learn how to work with text. Intermediate knowledge of Python will help you to make the most out of this book. If you are an NLP practitioner, this book will serve as a code reference when working on your projects.
