Loss bounds for uncertain transition probabilities in Markov decision processes
We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are precomputed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Article |
Language: | en_US |
Published: |
Institute of Electrical and Electronics Engineers (IEEE)
2014
|
Online Access: | http://hdl.handle.net/1721.1/86896 https://orcid.org/0000-0002-8585-6566 |