Approximate policy iteration for Markov decision processes via quantitative adaptive aggregations

We consider the problem of finding an optimal policy in a Markov decision process that maximises the expected discounted sum of rewards over an infinite time horizon. Since the explicit iterative dynamic programming scheme does not scale as the dimension of the state space increases, a number o...

Bibliographic Details
Main Authors: Abate, A, Češka, M, Kwiatkowska, M
Format: Conference item
Published: Springer Verlag 2016