Apache Solr for Indexing Data
Provisional Chinese title: Apache Solr 數據索引指南

Sachin Handiekar, Anshul Johri

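  • Publisher: Packt Publishing
  • Publication date: 2015-12-22
  • List price: $1,670
  • VIP price: $1,587 (5% off)
  • Language: English
  • Pages: 160
  • Binding: Paperback
  • ISBN: 1783553235
  • ISBN-13: 9781783553235
  • Related category: Full-text search
  • Overseas special-order book (requires separate checkout)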

Product Description

Enhance your Solr indexing experience with advanced techniques and the built-in functionalities available in Apache Solr

About This Book

  • Learn about distributed indexing and real-time optimization to change index data on the fly
  • Index data from various sources and web crawlers using built-in analyzers and tokenizers
  • This step-by-step guide is packed with real-life examples on indexing data

Who This Book Is For

This book is for developers who want to deepen their experience of indexing in Solr by learning about the various index handlers, analyzers, and methods it provides. Beginner-level Solr development skills are expected.

What You Will Learn

  • Get to know the basic features of Solr indexing and the analyzers/tokenizers available
  • Index XML/JSON data in Solr using the HTTP Post tool and the curl command
  • Work with Data Import Handler to index data from a database
  • Use Apache Tika with Solr to index Word documents, PDFs, and much more
  • Utilize Apache Nutch and Solr integration to index crawled data from web pages
  • Update indexes in real time from data feeds
  • Discover techniques to index multi-language and distributed data in Solr
  • Combine the various indexing techniques into a real-life working example of an online shopping web application
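
As a rough illustration of the JSON indexing item above, the sketch below builds (but does not send) the kind of HTTP POST that a curl command would issue against Solr's JSON update handler. It is not taken from the book. The host, the core name "products", and the document fields are hypothetical and chosen only for illustration.

```python
import json
from urllib.request import Request


def build_index_request(base_url, core, docs, commit=True):
    """Build (without sending) a POST that indexes JSON documents in Solr."""
    url = f"{base_url}/solr/{core}/update"
    if commit:
        # Ask Solr to commit right away so the documents become searchable.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, core name, and fields (assumptions for this sketch).
request = build_index_request(
    "http://localhost:8983",
    "products",
    [{"id": "book-1", "title": "Apache Solr for Indexing Data"}],
)
print(request.get_method(), request.full_url)
```

To actually index the documents, the request would be passed to urllib.request.urlopen with a Solr server running at that address.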

In Detail

Apache Solr is a widely used, open source enterprise search server that delivers powerful indexing and searching features. These features help fetch relevant information from various sources and documents. Solr also combines with other open source tools such as Apache Tika and Apache Nutch to provide more powerful features.

This fast-paced guide starts by helping you set up Solr and get acquainted with its basic building blocks, to give you a better understanding of Solr indexing. You'll quickly move on to indexing text and index-time boosting. Next, you'll focus on basic indexing techniques, various index handlers designed to modify documents, and indexing a structured data source through Data Import Handler.

Moving on, you will learn techniques to perform real-time indexing and atomic updates, as well as more advanced indexing techniques such as de-duplication. Later on, we'll help you set up a cluster of Solr servers that combines fault tolerance and high availability. You will also see how different aspects of Solr work in practice and how to use Solr with e-commerce data.
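
To make the idea of an atomic update concrete, here is a minimal sketch, not taken from the book, of the JSON payload Solr accepts for changing individual fields of an existing document instead of re-sending the whole document. The "set" and "inc" modifiers are part of Solr's atomic update syntax. The helper name atomic_update, the document id, and the field names are hypothetical.

```python
import json


def atomic_update(doc_id, set_fields=None, inc_fields=None, id_field="id"):
    """Build one atomic-update document using Solr's "set"/"inc" modifiers."""
    doc = {id_field: doc_id}
    for name, value in (set_fields or {}).items():
        doc[name] = {"set": value}    # replace the field's current value
    for name, amount in (inc_fields or {}).items():
        doc[name] = {"inc": amount}   # add to a numeric field
    return doc


# Hypothetical id and fields (assumptions for this sketch).
payload = json.dumps(
    [atomic_update("book-1", set_fields={"price": 25}, inc_fields={"stock": -1})]
)
print(payload)
```

The resulting string would be sent as the body of a POST to the core's update handler with a JSON content type.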

By the end of the book, you will be competent and confident working with indexing and will have a solid knowledge base for building indexing solutions efficiently.

Style and approach

This fast-paced guide is packed with examples that are written in an easy-to-follow style and accompanied by detailed explanations. Working examples are included to help you get better results for your applications.
