Pathologies of Temporal Difference Methods in Approximate Dynamic Programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated convergence behavior is complex, and not well understood at present. An important question is whether the policy iterati...
Main Author: | Bertsekas, Dimitri P. |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers, 2011 |
Online Access: | http://hdl.handle.net/1721.1/64641 https://orcid.org/0000-0001-6909-7208 |
Similar Items
- A unified framework for temporal difference methods
  by: Bertsekas, Dimitri P.
  Published: (2010)
- Approximate policy iteration: A survey and some new methods
  by: Bertsekas, Dimitri P.
  Published: (2012)
- Proximal algorithms and temporal difference methods for solving fixed point problems
  by: Bertsekas, Dimitri P
  Published: (2021)
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
  by: Yu, Huizhen, et al.
  Published: (2012)
- Basis Function Adaptation Methods for Cost Approximation in MDP
  by: Yu, Huizhen, et al.
  Published: (2010)