Learning robust policies for uncertain parametric Markov decision processes


Bibliographic Details
Main Authors: Rickard, L, Abate, A, Margellos, K
Format: Conference item
Language: English
Published: Journal of Machine Learning Research 2024
Collection: OXFORD
Description: Synthesising verifiably correct controllers for dynamical systems is crucial for safety-critical problems. To achieve this, it is important to account for uncertainty in a robust manner, while at the same time it is often of interest to avoid being overly conservative with the view of achieving a better cost. We propose a method for verifiably safe policy synthesis for a class of finite state models, under the presence of structural uncertainty. In particular, we consider uncertain parametric Markov decision processes (upMDPs), a special class of Markov decision processes, with parameterised transition functions, where such parameters are drawn from a (potentially) unknown distribution. Our framework leverages recent advancements in the so-called scenario approach theory, where we represent the uncertainty by means of scenarios, and provide guarantees on synthesised policies satisfying probabilistic computation tree logic (PCTL) formulae. We consider several common benchmarks/problems and compare our work to recent developments for verifying upMDPs.
ID: oxford-uuid:f92e0cff-c024-4f20-85dc-95e83dfceb14
Institution: University of Oxford