Reinforcement Learning and Dynamic Programming Using Function Approximators

Busoniu, Lucian, Babuska, Robert, de Schutter, Bart, Ernst, Damien

Description

From household appliances to applications in robotics, engineered systems involving complex dynamics can only be as effective as the algorithms that control them. While Dynamic Programming (DP) has provided researchers with a way to optimally solve decision and control problems involving complex dynamic systems, its practical value was limited by algorithms that lacked the capacity to scale up to realistic problems.

However, in recent years, dramatic developments in Reinforcement Learning (RL), the model-free counterpart of DP, changed our understanding of what is possible. Those developments led to the creation of reliable methods that can be applied even when a mathematical model of the system is unavailable, allowing researchers to solve challenging control problems in engineering, as well as in a variety of other disciplines, including economics, medicine, and artificial intelligence.

Reinforcement Learning and Dynamic Programming Using Function Approximators provides a comprehensive and unparalleled exploration of the field of RL and DP. With a focus on continuous-variable problems, this seminal text details essential developments that have substantially altered the field over the past decade. In its pages, pioneering experts provide a concise introduction to classical RL and DP, followed by an extensive presentation of the state-of-the-art and novel methods in RL and DP with approximation. Combining algorithm development with theoretical guarantees, they elaborate on their work with illustrative examples and insightful comparisons. Three individual chapters are dedicated to representative algorithms from each of the major classes of techniques: value iteration, policy iteration, and policy search. The features and performance of these algorithms are highlighted in extensive experimental studies on a range of control applications.
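To give a flavor of the first of these algorithm classes, the sketch below runs tabular value iteration on a hypothetical two-state, two-action MDP (the MDP, its numbers, and all variable names are illustrative, not taken from the book; the book itself focuses on continuous-variable problems, where such exact tables are replaced by function approximators).

```python
import numpy as np

# Illustrative toy MDP (not from the book): 2 states, 2 actions.
n_states, n_actions = 2, 2
gamma = 0.9  # discount factor

# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup:
    # Q(s,a) = R(s,a) + gamma * sum_s' P(s,a,s') V(s')
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop at the fixed point
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged values
print(V, policy)
```

Because the Bellman backup is a contraction with factor gamma, the iteration converges geometrically to the optimal value function; the greedy policy extracted from it is then optimal for this toy problem.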

The recent development of applications involving complex systems has led to a surge of interest in RL and DP methods and the subsequent need for a quality resource on the subject. For graduate students and others new to the field, this book offers a thorough introduction to both the basics and emerging methods. And for those researchers and practitioners working in the fields of optimal and adaptive control, machine learning, artificial intelligence, and operations research, this resource offers a combination of practical algorithms, theoretical analysis, and comprehensive examples that they will be able to adapt and apply to their own work.

Access the authors' website at www.dcsc.tudelft.nl/rlbook/ for additional material, including computer code used in the studies and information concerning new developments.

About the Authors

Robert Babuska, Lucian Busoniu, and Bart de Schutter are with the Delft University of Technology. Damien Ernst is with the University of Liege.
