Assessing and Improving Prediction and Classification: Theory and Algorithms in C++

Timothy Masters

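  • Publisher: Apress
  • Publication date: 2017-12-20
  • List price: $3,240
  • VIP price: $3,078 (5% off)
  • Language: English
  • Pages: 517
  • Binding: Paperback
  • ISBN: 1484233352
  • ISBN-13: 9781484233351
  • Related categories: C++ programming language, Algorithms-data-structures
  • Overseas-order book (requires separate checkout)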

Product Description

Assess the quality of your prediction and classification models in ways that accurately reflect their real-world performance, and then improve this performance using state-of-the-art algorithms such as committee-based decision making, resampling the dataset, and boosting.  This book presents many important techniques for building powerful, robust models and quantifying their expected behavior when put to work in your application.

Considerable attention is given to information theory, especially as it relates to discovering and exploiting relationships between variables employed by your models.  This presentation of an often confusing subject avoids advanced mathematics, focusing instead on concepts easily understood by those with modest background in mathematics.
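As a small illustration of the information-theoretic ideas mentioned above, the following sketch computes the Shannon entropy of a single variable after sorting its values into equal-width bins. It is not taken from the book: the function name `binned_entropy`, the equal-width binning, and the choice of bits as the unit are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy, in bits, of a variable discretized into `nbins`
// equal-width bins spanning its observed range.
// Hypothetical helper for illustration; returns 0 for empty input,
// zero bins, or a constant variable.
double binned_entropy(const std::vector<double>& x, std::size_t nbins) {
    if (x.empty() || nbins == 0) return 0.0;

    const auto mm = std::minmax_element(x.begin(), x.end());
    const double lo = *mm.first;
    const double hi = *mm.second;
    if (hi <= lo) return 0.0;  // all values identical: no uncertainty

    // Count how many cases fall into each bin.
    std::vector<std::size_t> count(nbins, 0);
    for (double v : x) {
        std::size_t k = static_cast<std::size_t>((v - lo) / (hi - lo) * nbins);
        if (k >= nbins) k = nbins - 1;  // the maximum value lands in the last bin
        ++count[k];
    }

    // H = -sum(p * log2(p)) over the nonempty bins.
    double h = 0.0;
    const double n = static_cast<double>(x.size());
    for (std::size_t c : count) {
        if (c == 0) continue;
        const double p = static_cast<double>(c) / n;
        h -= p * std::log2(p);
    }
    return h;
}
```

The largest possible value is log2(nbins), reached when the cases are spread evenly over the bins. A result far below that maximum means most cases are crowded into a few bins, for instance because a handful of extreme values stretch the range.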

All algorithms include an intuitive explanation of operation, essential equations, references to more rigorous theory, and commented C++ source code.  Many of these techniques are recent developments, still not in widespread use.  Others are standard algorithms given a fresh look.  In every case, the emphasis is on practical applicability, with all code written in such a way that it can easily be included in any program.


What You'll Learn
  • Compute entropy to detect problematic predictors.
  • Compute confidence and tolerance intervals for predictions, as well as confidence levels for classification decisions.
  • Improve numeric predictions using constrained and unconstrained combinations, variance-weighted interpolation, and kernel-regression smoothing.
  • Improve classification decisions using Borda counts, MinMax and MaxMin rules, union and intersection rules, logistic regression, selection by local accuracy, maximization of the fuzzy integral, and pairwise coupling.
  • Use information-theoretic techniques to rapidly screen large numbers of candidate predictors, identifying those that are especially promising.
  • Use Monte-Carlo permutation methods to assess the role of good luck in performance results.
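To make the last item in the list concrete, here is a minimal sketch of a Monte-Carlo permutation test. It is an illustration under stated assumptions, not the book's code: the performance measure is taken to be the Pearson correlation between predictions and targets, and the names `correlation` and `permutation_p_value` are hypothetical. The idea is to shuffle the targets many times, which destroys any real relationship, and to count how often the shuffled data score at least as well as the original.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Pearson correlation between two equal-length series. It stands in here
// for whatever performance measure a model reports (an assumption).
// Returns 0 for mismatched or empty input, or if either series is constant.
double correlation(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size();
    if (n == 0 || b.size() != n) return 0.0;

    double mean_a = 0.0, mean_b = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= static_cast<double>(n);
    mean_b /= static_cast<double>(n);

    double sab = 0.0, saa = 0.0, sbb = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double da = a[i] - mean_a;
        const double db = b[i] - mean_b;
        sab += da * db;
        saa += da * da;
        sbb += db * db;
    }
    const double denom = std::sqrt(saa * sbb);
    return denom > 0.0 ? sab / denom : 0.0;
}

// One-sided permutation p-value: the fraction of arrangements (the original
// plus `reps` random shuffles of the targets) whose score is at least as
// large as the score of the original arrangement.
double permutation_p_value(const std::vector<double>& pred,
                           std::vector<double> target,  // copied, then shuffled
                           int reps, unsigned seed) {
    const double observed = correlation(pred, target);
    std::mt19937 rng(seed);
    int at_least = 1;  // the unpermuted arrangement counts as one trial
    for (int r = 0; r < reps; ++r) {
        std::shuffle(target.begin(), target.end(), rng);
        if (correlation(pred, target) >= observed) ++at_least;
    }
    return static_cast<double>(at_least) / (reps + 1);
}
```

A small p-value suggests the observed score would rarely arise by luck alone. Counting the original arrangement among the trials keeps the estimate from ever being exactly zero, so the smallest value the function can return is 1 / (reps + 1).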


Who This Book Is For

Anyone who creates prediction or classification models will find a wealth of useful algorithms in this book.  Although all code examples are written in C++, the algorithms are described in sufficient detail that they can easily be programmed in any language.
