Bounds for Markov Decision Processes
We consider the problem of producing lower bounds on the optimal cost-to-go function of a Markov decision problem. We present two approaches to this problem: one based on the methodology of approximate linear programming (ALP) and another based on the so-called martingale duality approach. We show t...
Main Authors: | Desai, Vijay V., Farias, Vivek F., Moallemi, Ciamac C. |
---|---|
Other Authors: | Sloan School of Management |
Format: | Article |
Published: | John Wiley & Sons, Inc., 2019 |
Online Access: | http://hdl.handle.net/1721.1/120518 https://orcid.org/0000-0002-5856-9246 |
Similar Items
- Approximate Dynamic Programming via a Smoothed Linear Program
  by: Desai, Vijay V., et al.
  Published: (2012)
- Non-Parametric Approximate Dynamic Programming via the Kernel Method
  by: Bhat, Nikhil, et al.
  Published: (2014)
- Near-Optimal A-B Testing
  by: Bhat, Nikhil, et al.
  Published: (2021)
- Universal Reinforcement Learning
  by: Farias, Vivek F., et al.
  Published: (2010)
- Learning bounded optimal behavior using Markov decision processes
  by: Vuong, Hon Fai, 1975-
  Published: (2009)