Loss bounds for uncertain transition probabilities in Markov decision processes
We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are precomputed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Article |
Language: | en_US |
Published: |
Institute of Electrical and Electronics Engineers (IEEE)
2014
|
Online Access: | http://hdl.handle.net/1721.1/86896 https://orcid.org/0000-0002-8585-6566 |
_version_ | 1811088350611767296 |
---|---|
author | Jaillet, Patrick Mastin, Dana Andrew |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Jaillet, Patrick Mastin, Dana Andrew |
author_sort | Jaillet, Patrick |
collection | MIT |
description | We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are precomputed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Given a bound on the total variation error of estimated transition probability distributions, we derive upper bounds on the loss of expected total reward. The approach analyzes the growth of errors incurred by stepping backwards in time while precomputing value functions, which requires bounding a multilinear program. Loss bounds are given for the finite horizon undiscounted, finite horizon discounted, and infinite horizon discounted cases, and a tight example is shown. |
first_indexed | 2024-09-23T14:00:50Z |
format | Article |
id | mit-1721.1/86896 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T14:00:50Z |
publishDate | 2014 |
publisher | Institute of Electrical and Electronics Engineers (IEEE) |
record_format | dspace |
spelling | mit-1721.1/868962022-09-28T17:43:02Z Loss bounds for uncertain transition probabilities in Markov decision processes Jaillet, Patrick Mastin, Dana Andrew Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Mastin, Dana Andrew Jaillet, Patrick We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are precomputed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Given a bound on the total variation error of estimated transition probability distributions, we derive upper bounds on the loss of expected total reward. The approach analyzes the growth of errors incurred by stepping backwards in time while precomputing value functions, which requires bounding a multilinear program. Loss bounds are given for the finite horizon undiscounted, finite horizon discounted, and infinite horizon discounted cases, and a tight example is shown. National Science Foundation (U.S.) (Grant 1029603) National Science Foundation (U.S.). Graduate Research Fellowship Program 2014-05-09T14:23:59Z 2014-05-09T14:23:59Z 2012-12 Article http://purl.org/eprint/type/ConferencePaper 978-1-4673-2066-5 978-1-4673-2065-8 978-1-4673-2063-4 978-1-4673-2064-1 http://hdl.handle.net/1721.1/86896 Mastin, Andrew, and Patrick Jaillet. “Loss Bounds for Uncertain Transition Probabilities in Markov Decision Processes.” 2012 IEEE 51st IEEE Conference on Decision and Control (CDC) (n.d.). https://orcid.org/0000-0002-8585-6566 en_US http://dx.doi.org/10.1109/CDC.2012.6426504 Proceedings of the 2012 IEEE 51st IEEE Conference on Decision and Control (CDC) Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) MIT web domain |
spellingShingle | Jaillet, Patrick Mastin, Dana Andrew Loss bounds for uncertain transition probabilities in Markov decision processes |
title | Loss bounds for uncertain transition probabilities in Markov decision processes |
title_full | Loss bounds for uncertain transition probabilities in Markov decision processes |
title_fullStr | Loss bounds for uncertain transition probabilities in Markov decision processes |
title_full_unstemmed | Loss bounds for uncertain transition probabilities in Markov decision processes |
title_short | Loss bounds for uncertain transition probabilities in Markov decision processes |
title_sort | loss bounds for uncertain transition probabilities in markov decision processes |
url | http://hdl.handle.net/1721.1/86896 https://orcid.org/0000-0002-8585-6566 |
work_keys_str_mv | AT jailletpatrick lossboundsforuncertaintransitionprobabilitiesinmarkovdecisionprocesses AT mastindanaandrew lossboundsforuncertaintransitionprobabilitiesinmarkovdecisionprocesses |