Apache Solr 5.x Beginner's Guide, 2/e (Paperback)
Alfredo Serafini
- Publisher: Packt Publishing
- Publication date: 2017-11-06
- List price: $2,050
- VIP price: 5% off, $1,948
- Language: English
- Pages: 447
- Binding: Paperback
- ISBN: 1785282433
- ISBN-13: 9781785282430
Related categories:
Full-text search engines
Overseas-order book (separate checkout required)
Product description
Build and configure your own search engine using Apache Solr 5.X
About This Book
- Use Apache Solr to build and customize Solr/Lucene-based search solutions, starting from indexing
- Apache Solr provides powerful faceted, geospatial, and full-text search, along with rich document handling
- This is a hands-on guide packed with real-world implementations of the Solr search feature
Who This Book Is For
This book is an ideal starting point for anyone who wants to embed a search engine in their site.
In particular, if you are a data architect or a project manager and you need to make some key design decisions, every example included is applicable in real-world contexts.
If you are a Java developer who would like to start using Apache Solr to build and customize Solr/Lucene-based search solutions, then this is the book for you.
What You Will Learn
- Define a simple and effective full-text search
- Write configurations incrementally and test them with the Solr web UI or cURL
- Get acquainted with the logical structure of an inverted index
- Understand how to use the text analysis chain and customize searches for different use cases
- Use faceted search, simple analytics, or data clustering to enhance users' search experience
- Import data from various sources (including XML and databases), clean or expand it with scripting, and expose it in several formats such as CSV, JSON, and XML
- Use the Solr UI for simple maintenance tasks
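To make the inverted index and text analysis items above more concrete, here is a minimal sketch in Python. It is not Solr or Lucene code and is not taken from the book: the `analyze`, `build_inverted_index`, and `search` functions, the stop-word list, and the sample documents are all hypothetical, chosen only to illustrate the logical structure (terms mapped to the documents that contain them).

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; real analyzers are configurable.
STOP_WORDS = {"a", "an", "the", "of", "and"}


def analyze(text):
    """Toy analysis chain: lowercase, tokenize on non-alphanumerics, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every analyzed query term (AND semantics)."""
    terms = analyze(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents.
docs = {
    1: "The Art of Search",
    2: "Search engines and indexing",
    3: "Indexing the Web",
}
index = build_inverted_index(docs)
print(search(index, "Search"))           # documents containing "search"
print(search(index, "indexing search"))  # documents containing both terms
```

Because the same `analyze` function runs at index time and at query time, a query for "Search" matches the lowercased term stored in the index. This is a simplified view of why the analysis chain matters for matching.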
In Detail
Apache Solr is a standalone enterprise search server that exposes services for advanced text search, spatial search, faceted search, and analytics. Its architecture is fast and scales from working prototypes to complex distributed deployments; the internal workflow is also open to component customization and to integration with external tools for advanced text analysis.
This book is a practical introduction to the Solr platform that shows you how to configure your own search engine experience and embed a search engine in your website to help users navigate the data.
We start with the basics of how to use Solr and perform indexing on the default installation. You'll be introduced to the workings of the Solr schema API, the structure of an inverted index, text analysis, and the concept of similarity. Next, we demonstrate indexing and searching with some sample data.
Moving on, you'll learn how to use faceted search, work with multiple entities and multicores, and index external data sources such as open source datasets. You'll get to grips with basic SolrCloud concepts such as routing and shard splitting, ZooKeeper, and clustering Solr for distributed searches. You'll also learn how to detect language with Tika and LangDetect.
At the end of the book, we build a project for a bookcrossing site, which brings all the concepts together to give you the bigger picture.
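As a rough illustration of the faceted search mentioned above, the following Python sketch builds the URL for a faceted request to Solr's `select` handler without sending it. The host and port, the core name `books`, and the field names `title` and `genre` are assumptions made for this example and do not come from the book; `q`, `rows`, `facet`, `facet.field`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, core, query, facet_field, rows=0):
    """Build a select URL that asks Solr for facet counts on one field.

    rows=0 requests no documents, only the facet counts.
    """
    params = [
        ("q", query),
        ("rows", rows),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("wt", "json"),
    ]
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical local installation, core, and fields.
url = facet_query_url("http://localhost:8983/solr", "books", "title:solr", "genre")
print(url)
```

The resulting URL can be pasted into a browser or passed to cURL against a running instance, which fits the incremental test-as-you-configure approach the description refers to.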