Natural Language Processing with Spark NLP: Learning to Understand Text at Scale

Thomas, Alex

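  • Publisher: O'Reilly
  • Publication date: 2020-08-04
  • List price: $2,720
  • VIP price: $2,584 (5% off)
  • Language: English
  • Pages: 350
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1492047767
  • ISBN-13: 9781492047766
  • Related categories: Spark, Text-mining
  • Overseas special-order title (requires separate checkout)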

Product Description

Want to build an application that uses natural language text, but aren't sure where to start or what tools to use? This practical book gets you started with natural language processing from the basics to powerful modern techniques. Data scientists will learn how to build enterprise-quality NLP applications using deep learning and the Apache Spark distributed processing framework.

This guide includes concrete examples, practical and theoretical explanations, and hands-on exercises for NLP on Spark. You'll understand why these techniques work from machine learning, linguistic, and practical points of view.

This book shows you how to:

  • Process text in a distributed environment using Spark-NLP, a production-ready library for NLP built on Spark
  • Create, tune, and deploy your own word embeddings
  • Adapt your NLP applications to multiple languages
  • Use text in machine learning and deep learning

About the Author

Alex Thomas is a data scientist at Indeed. He has used natural language processing (NLP) and machine learning with clinical data, identity data, and now employer and jobseeker data. He has worked with Apache Spark since version 0.9, and has worked with NLP libraries and frameworks including UIMA and OpenNLP.
