Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy Lin, Rodrigo Nogueira, Andrew Yates
- Publisher: Morgan & Claypool
- Publication date: 2021-10-29
- List price: $3,190
- VIP price (5% off): $3,031
- Language: English
- Pages: 325
- Binding: Quality Paper (also called trade paper)
- ISBN: 1636392288
- ISBN-13: 9781636392288
Overseas special-order title (must be checked out separately)
Description
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing (NLP) applications. This book provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example. The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in NLP, information retrieval (IR), and beyond.
This book provides a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and researchers who wish to pursue work in this area. It covers a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage architectures and dense retrieval techniques that perform ranking directly. Two themes pervade the book: techniques for handling long documents, beyond typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this book also attempts to prognosticate where the field is heading.
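To make the two high-level categories concrete, here is a minimal Python sketch (ours, not taken from the book): a multi-stage pipeline in which a cheap first-stage retriever produces candidates that a transformer-style cross-encoder rescores, contrasted with dense retrieval, which ranks precomputed document embeddings by similarity to a query embedding. All function names and the toy scoring logic are illustrative placeholders standing in for real components such as BM25 and BERT, not any library's actual API.

```python
# Sketch of the two technique families the book covers:
# (1) multi-stage reranking and (2) dense retrieval.
import numpy as np


def first_stage_retrieve(query: str, corpus: list[str], k: int = 100) -> list[int]:
    """Cheap first stage (BM25 in practice); here, naive term overlap."""
    q_terms = set(query.lower().split())
    scores = [len(q_terms & set(doc.lower().split())) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:k]


def cross_encoder_score(query: str, doc: str) -> float:
    """Placeholder for a transformer that reads the (query, document) pair jointly."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def rerank(query: str, corpus: list[str], candidates: list[int]) -> list[int]:
    """Second stage: rescore only the candidate list with the expensive model."""
    return sorted(candidates, key=lambda i: cross_encoder_score(query, corpus[i]), reverse=True)


def dense_retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 10) -> list[int]:
    """Dense retrieval: rank all documents by similarity of precomputed embeddings."""
    scores = doc_vecs @ query_vec
    return list(np.argsort(-scores)[:k])


if __name__ == "__main__":
    corpus = ["transformers for text ranking", "cooking with cast iron", "BERT for search"]
    query = "text ranking with BERT"
    candidates = first_stage_retrieve(query, corpus, k=3)
    print("reranked:", rerank(query, corpus, candidates))
    # Toy 2-d vectors stand in for encoder output embeddings.
    doc_vecs = np.array([[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]])
    print("dense:", dense_retrieve(np.array([1.0, 0.2]), doc_vecs, k=2))
```

The sketch also hints at the effectiveness/efficiency tradeoff the book emphasizes: the cross-encoder sees the query and document together and is typically more accurate but too slow to apply to a whole corpus, while dense retrieval scores every document with a fast vector operation over embeddings computed ahead of time.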