Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science

Riezler, Stefan, Hagmann, Michael

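  • Publisher: Morgan & Claypool
  • Publication date: 2021-12-03
  • List price: $2,550
  • VIP price: $2,423 (5% off)
  • Language: English
  • Pages: 165
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1636392717
  • ISBN-13: 9781636392714
  • Related categories: Text-mining, Data Science
  • Overseas special-order title (must be checked out separately)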

Product description

Empirical methods are a means of answering methodological questions in the empirical sciences using statistical techniques. The methodological questions addressed in this book are those of validity, reliability, and significance. In machine learning, these correspond, respectively, to the questions of whether a model predicts what it purports to predict, whether a model's performance is consistent across replications, and whether a performance difference between two models is due to chance. The goal of this book is to answer these questions with concrete statistical tests that can be applied to assess the validity, reliability, and significance of data annotation and machine learning prediction in the fields of NLP and data science.

Our focus is on model-based empirical methods where data annotations and model predictions are treated as training data for interpretable probabilistic models from the well-understood families of generalized additive models (GAMs) and linear mixed effects models (LMEMs). Based on the interpretable parameters of the trained GAMs or LMEMs, the book presents model-based statistical tests such as a validity test that allows detecting circular features that circumvent learning. Furthermore, the book discusses a reliability coefficient using variance decomposition based on random effect parameters of LMEMs. Last, a significance test based on the likelihood ratio of nested LMEMs trained on the performance scores of two machine learning models is shown to naturally allow the inclusion of variations in meta-parameter settings into hypothesis testing, and further facilitates a refined system comparison conditional on properties of input data.

This book can be used as an introduction to empirical methods for machine learning in general, with a special focus on applications in NLP and data science. The book is self-contained, with an appendix on the mathematical background of GAMs and LMEMs and an accompanying webpage that includes R code to replicate the experiments presented in the book.
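
To give a rough feel for what a reliability coefficient based on variance decomposition might look like, here is a minimal sketch. It is not the book's method: the description says the book derives the coefficient from random effect parameters of fitted LMEMs (with R code on its webpage), whereas this sketch swaps in the classical one-way ANOVA method-of-moments estimator for a balanced design. The function names, the grouping of scores by test item, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def variance_components(scores):
    """Estimate between-group and within-group variance.

    `scores` is a list of groups, e.g. one list per test item holding the
    evaluation score obtained in each replicated training run (hypothetical
    layout). Uses the one-way ANOVA method-of-moments estimator, a stand-in
    for the random-effect variances an LMEM would estimate.
    """
    n = len(scores)       # number of groups (e.g. test items)
    k = len(scores[0])    # replications per group
    if n < 2 or k < 2:
        raise ValueError("need at least 2 groups and 2 replications")
    if any(len(group) != k for group in scores):
        raise ValueError("sketch assumes a balanced design")

    grand_mean = mean(x for group in scores for x in group)
    group_means = [mean(group) for group in scores]

    # Mean squares between groups and within groups.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for group, m in zip(scores, group_means) for x in group
    ) / (n * (k - 1))

    # Negative estimates are truncated at zero.
    var_between = max((ms_between - ms_within) / k, 0.0)
    return var_between, ms_within


def reliability(scores):
    """Share of total variance attributable to the groups themselves."""
    var_between, var_within = variance_components(scores)
    total = var_between + var_within
    return var_between / total if total > 0 else float("nan")


# Toy data: 3 items, 2 replications each (made-up numbers).
consistent = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
noisy = [[1.0, 3.0], [1.0, 3.0], [1.0, 3.0]]
print(reliability(consistent), reliability(noisy))
```

A value near 1 means that almost all of the variation comes from differences between items rather than from replication noise, and a value near 0 means the opposite. An LMEM-based version would additionally allow several crossed random effects, which this sketch does not attempt.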
