Product Description
Online search engines are an essential tool for seeking information, but the results they return can contain undesirable forms of bias with respect to protected attributes such as gender or race. These biases can arise from the word embeddings used by search engines, the design of re-ranking algorithms, the development of retrieval algorithms, or a variety of other sources. Classical information retrieval (IR) methods, such as query recommendation or query expansion, were designed to produce the most relevant results. However, if such biases are present in the system, these methods will also deliver biased results.
IR systems and recommender systems also play a major role in social media, where platforms have pivoted away from friend-follow timelines to "for you" timelines containing algorithmically selected content. If these algorithms are biased (towards, say, maximizing screen time to show ads, or maximizing user interactions such as likes and comments), they may push end users towards clickbait or non-mainstream trending topics.
This book presents an overview of modern IR and discusses the work done to mitigate biases in IR systems. It also examines methods for debiasing word embeddings and re-ranking search results to address group fairness, and presents a query reformulation method that analyzes bias in search results and delivers balanced results to the end user.
Awareness of how information retrieval systems work, ways to mitigate bias in search results, and the tradeoffs between accuracy and bias metrics in search results will help readers understand real-world search engines.
About the Authors
Harshit Mishra is a Ph.D. student in the Department of Electrical Engineering and Computer Science at Syracuse University. He holds a Master of Science degree in Computer Science from Syracuse University. His research interests include natural language processing, algorithmic fairness, network science, and AI for social good.
Sucheta Soundarajan is an Associate Professor in the Department of Electrical Engineering and Computer Science at Syracuse University. She received her Ph.D. in Computer Science from Cornell University. Her research interests include the theory and applications of network science, algorithmic fairness, and AI in government.