Counterexamples in Markov Decision Processes
Tentative Chinese title: 馬可夫決策過程中的反例

Name: Counterexamples in Markov Decision Processes
Price: 6137 TWD
Availability: OnlineOnly
Author: Alexey Piunovskiy
ISBN: 1800616759

Alexey Piunovskiy

Publisher: World Scientific Pub
Publication date: 2025-03-29
List price: $6,460
VIP price: 5% off, $6,137
Language: English
Pages: 508
Binding: Hardcover - also called cloth, retail trade, or trade
ISBN: 1800616759
ISBN-13: 9781800616752
Related categories: Reinforcement

Overseas special-order book (requires separate checkout)

Product description

Markov Decision Processes (MDPs) form a cornerstone of applied probability, with over 50 years of rich research history. Throughout this time, numerous foundational books and thousands of journal articles have shaped the field. The central objective of MDP theory is to identify the optimal control strategy for Markov random processes with discrete time. Interestingly, the best control strategies often display unexpected or counterintuitive behaviors, as documented by a wide array of studies.

This book gathers some of the most compelling examples of such phenomena while introducing new ones. By doing so, it serves as a valuable companion to existing textbooks. While many examples require little to no prior knowledge, others delve into advanced topics and will primarily interest specialists.

In this second edition, extensive revisions have been made, correcting errors and refining the content, with a wealth of new examples added. The range of examples spans from elementary to advanced, requiring background knowledge in areas like measure theory, convex analysis, and advanced probability. A new chapter on continuous time jump processes has also been introduced. The entire text has been reworked for clarity and accessibility.

This book is an essential resource for active researchers and graduate students in the field of Markov Decision Processes.
