Approximate policy iteration: A survey and some new methods
We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey a number of issues: convergence and rate of convergence of approximate policy evaluation methods, singularity and susceptibi...
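For readers unfamiliar with the method the abstract refers to, the sketch below shows classical (exact) policy iteration for a small finite MDP; the paper's subject is what happens when the exact evaluation step is replaced by approximation and simulation. This is a minimal illustration, not the paper's method: the function name, array layout, and the toy numbers in the usage example are all assumptions for the sketch.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Classical (exact) policy iteration for a finite MDP.

    P: transition probabilities, shape (S, A, S)
    R: expected one-stage rewards, shape (S, A)
    gamma: discount factor in [0, 1)
    Returns a deterministic policy (length-S int array) and its value function.
    """
    S, A = R.shape
    policy = np.zeros(S, dtype=int)  # arbitrary initial policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) J = R_pi exactly.
        # The surveyed methods replace this step with approximation/simulation.
        P_pi = P[np.arange(S), policy]          # (S, S) under current policy
        R_pi = R[np.arange(S), policy]          # (S,)
        J = np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)
        # Policy improvement: greedy one-step lookahead on J.
        Q = R + gamma * P @ J                   # (S, A) action values
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):  # converged to a fixed policy
            return policy, J
        policy = new_policy

# Tiny two-state, two-action usage example (illustrative numbers only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
pi, J = policy_iteration(P, R)
```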
Main Author: | Bertsekas, Dimitri P.
---|---
Other Authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: | Article
Language: | en_US
Published: | Springer-Verlag, 2012
Online Access: | http://hdl.handle.net/1721.1/73485 https://orcid.org/0000-0001-6909-7208
Similar Items
- Distributed Asynchronous Policy Iteration in Dynamic Programming
  by: Bertsekas, Dimitri P., et al.
  Published: (2011)
- Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
  by: Bertsekas, Dimitri P., et al.
  Published: (2019)
- Q-learning and policy iteration algorithms for stochastic shortest path problems
  by: Yu, Huizhen, et al.
  Published: (2015)
- Pathologies of Temporal Difference Methods in Approximate Dynamic Programming
  by: Bertsekas, Dimitri P.
  Published: (2011)
- Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems
  by: Wang, Mengdi, et al.
  Published: (2015)