Least Squares Temporal Difference Methods: An Analysis under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) with the least squares temporal difference (LSTD) algorithm, LSTD($\lambda$), in an exploration-enhanced learning context, where policy costs are computed from observations of a Markov chain differe...
Main Author: | Yu, Huizhen |
---|---|
Other Authors: | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems |
Format: | Article |
Language: | en_US |
Published: | Society for Industrial and Applied Mathematics, 2013 |
Online Access: | http://hdl.handle.net/1721.1/77629 |
Similar Items
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
  by: Yu, Huizhen, et al.
  Published: (2012)
- Least Squares Shadowing Method for Sensitivity Analysis of Differential Equations
  by: Chater, Mario, et al.
  Published: (2018)
- Available Transfer Capability and Least Square Method
  by: Hojabri, Mojgan, et al.
  Published: (2012)
- Available transfer capability and least square method
  by: Hojabri, Mojgan, et al.
  Published: (2012)
- Robust feasible generalized least squares: A remedial measures of heteroscedasticity
  by: Rana, Sohel, et al.
  Published: (2015)