Pathologies of Temporal Difference Methods in Approximate Dynamic Programming

Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated convergence behavior is complex, and not well understood at present. An important question is whether the policy iterati...

Bibliographic Details
Main Author:	Bertsekas, Dimitri P.
Other Authors:	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format:	Article
Language:	en_US
Published:	Institute of Electrical and Electronics Engineers 2011
Online Access:	http://hdl.handle.net/1721.1/64641 https://orcid.org/0000-0001-6909-7208

Similar Items

A unified framework for temporal difference methods
by: Bertsekas, Dimitri P.
Published: (2010)

Approximate policy iteration: A survey and some new methods
by: Bertsekas, Dimitri P.
Published: (2012)

Proximal algorithms and temporal difference methods for solving fixed point problems
by: Bertsekas, Dimitri P
Published: (2021)

Convergence Results for Some Temporal Difference Methods Based on Least Squares
by: Yu, Huizhen, et al.
Published: (2012)

Basis Function Adaptation Methods for Cost Approximation in MDP
by: Yu, Huizhen, et al.
Published: (2010)

Regular Policies in Abstract Dynamic Programming
by: Bertsekas, Dimitri P
Published: (2018)

Dynamic programming : deterministic and stochastic models
by: Bertsekas, Dimitri P.
Published: (1987)

6.231 Dynamic Programming and Stochastic Control, Fall 2011
by: Bertsekas, Dimitri
Published: (2011)

6.231 Dynamic Programming and Stochastic Control, Fall 2002
by: Bertsekas, Dimitri P.
Published: (2002)

6.231 Dynamic Programming and Stochastic Control, Fall 2008
by: Bertsekas, Dimitri
Published: (2008)

Multiagent value iteration algorithms in dynamic programming and reinforcement learning
by: Dimitri Bertsekas
Published: (2020-12-01)

Distributed Asynchronous Policy Iteration in Dynamic Programming
by: Bertsekas, Dimitri P., et al.
Published: (2011)

A Unifying Polyhedral Approximation Framework for Convex Optimization
by: Bertsekas, Dimitri P., et al.
Published: (2011)

Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
by: Bertsekas, Dimitri P, et al.
Published: (2019)

Incremental proximal methods for large scale convex optimization
by: Bertsekas, Dimitri P.
Published: (2012)

Dynamic and stochastic control
by: Bertsekas, Dimitri P.
Published: (1976)

Newton’s method for reinforcement learning and model predictive control
by: Dimitri Bertsekas
Published: (2022-06-01)

Incremental constraint projection methods for variational inequalities
by: Wang, Mengdi, et al.
Published: (2015)

Stochastic First-Order Methods with Random Constraint Projection
by: Wang, Mengdi, et al.
Published: (2016)

Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems
by: Wang, Mengdi, et al.
Published: (2015)

Parallel and Distributed Computation: Numerical Methods
by: Bertsekas, Dimitri P., et al.
Published: (2003)

On the convergence of simulation-based iterative methods for solving singular linear systems
by: Mengdi Wang, et al.
Published: (2013-01-01)

Control of uncertain systems with a set-membership description of the uncertainty.
by: Bertsekas, Dimitri P
Published: (2005)

New auction algorithms for the assignment problem and extensions
by: Dimitri Bertsekas
Published: (2024-03-01)

6.253 Convex Analysis and Optimization, Spring 2010
by: Bertsekas, Dimitri
Published: (2010)

6.253 Convex Analysis and Optimization, Spring 2004
by: Bertsekas, Dimitri
Published: (2004)

Parallel and distributed computation : numerical methods
by: Bertsekas, Dimitri P., et al.
Published: (1989)

Non-Parametric Approximate Dynamic Programming via the Kernel Method
by: Bhat, Nikhil, et al.
Published: (2014)

On the solution of some minimax problems
by: Bertsekas, Dimitri P.

Q-learning and policy iteration algorithms for stochastic shortest path problems
by: Yu, Huizhen, et al.
Published: (2015)

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems
by: Yu, Huizhen, et al.
Published: (2015)

An analysis of temporal-difference learning with function approximation
Published: (2003)

Air-Combat Strategy Using Approximate Dynamic Programming
by: McGrew, James S., et al.
Published: (2011)

Approximate Dynamic Programming via a Smoothed Linear Program
by: Desai, Vijay V., et al.
Published: (2012)

Projected equation and aggregation-based approximate dynamic programming methods for Tetris
by: Hwang, Daw-sen
Published: (2011)

A linear programming methodology for approximate dynamic programming
by: Díaz Henry, et al.
Published: (2020-06-01)

An approximate dynamic programming method for unit-based small hydropower scheduling
by: Yueyang Ji, et al.
Published: (2022-07-01)

An approximate dynamic programming approach to solving dynamic oligopoly models
by: Farias, Vivek F., et al.
Published: (2012)

An approximate dynamic programming approach for designing train timetables
by: Pena Alcaraz, Maite, et al.
Published: (2016)

Approximate dynamic programming : solving the curses of dimensionality
by: Powell, Warren B. 1955-
Published: (2007)