Probabilistic Indexing for Information Search and Retrieval in Large Collections of Handwritten Text Images

Toselli, Alejandro Héctor, Puigcerver, Joan, Vidal, Enrique

  • Publisher: Springer
  • Publication date: 2024-04-11
  • List price: $5,900
  • VIP price: $5,605 (5% off)
  • Language: English
  • Pages: 344
  • Binding: Hardcover - also called cloth, retail trade, or trade
  • ISBN: 3031553888
  • ISBN-13: 9783031553882
  • Overseas special-order title (requires separate checkout)

Product description

This book provides a comprehensive presentation of a recently introduced framework, named "probabilistic indexing" (PrIx), for searching text in large collections of document images and other related applications. It fosters the development of new search engines for effective information retrieval from manuscripts that lack the electronic text (transcripts) typically required for such search and retrieval tasks.

The book is structured into 11 chapters and three appendices. The first two chapters briefly outline the necessary fundamentals and state of the art in pattern recognition, statistical decision theory, and handwritten text recognition. Chapter 3 presents approaches for indexing (as opposed to "spotting") each region of a handwritten text image which is likely to contain a word. Next, Chapter 4 describes models adopted for handwritten text in images, namely hidden Markov models, convolutional and recurrent neural networks, and language models, and provides full details of weighted finite-state transducer (WFST) concepts and methods needed in later chapters of the book. Chapter 5 explains the set of techniques and algorithms developed to generate image probabilistic indexes which allow for fast search and retrieval of textual information in the indexed images. Chapter 6 then presents experimental evaluations of the proposed framework and algorithms on different traditional benchmark datasets and compares them with other approaches, while Chapter 7 reviews the most popular keyword-spotting approaches. Chapter 8 explains how PrIx can support classical free-text search tools, while Chapter 9 presents new methods that use PrIx not only for searching, but also to deal with text analytics and other related natural language processing and information extraction tasks. Chapter 10 shows how the proposed solutions can be used to effectively index very large collections of handwritten document images, before Chapter 11 finally summarizes the book and suggests promising lines of future research. The appendices detail the necessary mathematical foundations for the work and present details of the text image collections and datasets used in the experiments throughout the book.

This book is written for researchers and (post-)graduate students in pattern recognition and information retrieval. It will also be of interest to people in areas like history, criminology, or psychology who need technical support to evaluate, understand or decode historical or contemporary handwritten text.

 

About the authors

Alejandro Héctor Toselli is currently working as a PostDoc (María Zambrano grant) at the Universitat Politècnica de València. He obtained an Electrical Engineering degree from the Universidad Nacional de Tucumán (Argentina, 1997) and a PhD in Computer Science from the Universitat Politècnica de València (UPV) (Spain, 2004). His research expertise focuses primarily on Document Analysis and Recognition, in which he has more than 20 years of experience, publishing on these topics and working on related projects funded by European and US institutions. He held postdoctoral fellowships at Northeastern University (Boston, USA) in the multi-institutional Open Islamicate Texts Initiative (OpenITI) and at the "Institut de Recherche en Informatique et Systèmes Aléatoires" (IRISA, Rennes, France).

Joan Puigcerver received his MSc and PhD in Computer Science from the Universitat Politècnica de València, in 2014 and 2018, respectively, focusing on probabilistic indexing and handwritten text recognition. In 2018, he joined Google Research as a software engineer. His research focuses on deep learning architectures, transfer learning, and computer vision. Joan is a member of the Spanish Society for Pattern Recognition and Image Analysis (AERFAI), an affiliate organization of the International Association for Pattern Recognition (IAPR).

Enrique Vidal is an emeritus professor of the Universitat Politècnica de València (Spain) and former co-leader of the PRHLT research center there. He is co-author of hundreds of research papers in the fields of Pattern Recognition, Multimodal Interaction and applications to Language, Speech and Image Processing and has led many important projects in these fields. Enrique is a fellow of the International Association for Pattern Recognition (IAPR).

 
