Transience in countable MDPs

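The Transience objective is not to visit any state infinitely often. While this is not possible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transience in countably infinite MDPs. 1) There exist uniformly ε-optimal MD strategies (memoryless deterministic) for Transience, even in infinitely branching MDPs. 2) Optimal strategies for Transience need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists then there is also an optimal MD strategy. 3) If an MDP is universally transient (i.e., almost surely transient under all strategies) then many other objectives have a lower strategy complexity than in general MDPs. E.g., ε-optimal strategies for Safety and co-Büchi and optimal strategies for {0,1,2}-Parity (where they exist) can be chosen MD, even if the MDP is infinitely branching.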

Bibliographic Details
Main Authors:	Kiefer, SM; Mayr, R; Shirmohammadi, M; Totzke, P
Format:	Conference item
Language:	English
Published:	Schloss Dagstuhl 2021

_version_	1826262571600052224
author	Kiefer, SM; Mayr, R; Shirmohammadi, M; Totzke, P
collection	OXFORD
description	The Transience objective is not to visit any state infinitely often. While this is not possible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transience in countably infinite MDPs. 1) There exist uniformly ε-optimal MD strategies (memoryless deterministic) for Transience, even in infinitely branching MDPs. 2) Optimal strategies for Transience need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists then there is also an optimal MD strategy. 3) If an MDP is universally transient (i.e., almost surely transient under all strategies) then many other objectives have a lower strategy complexity than in general MDPs. E.g., ε-optimal strategies for Safety and co-Büchi and optimal strategies for {0,1,2}-Parity (where they exist) can be chosen MD, even if the MDP is infinitely branching.
first_indexed	2024-03-06T19:38:18Z
format	Conference item
id	oxford-uuid:1fc859e7-78fa-4ab0-92b3-aa3d2ba3ebc7
institution	University of Oxford
language	English
last_indexed	2024-03-06T19:38:18Z
publishDate	2021
publisher	Schloss Dagstuhl
record_format	dspace
spelling	oxford-uuid:1fc859e7-78fa-4ab0-92b3-aa3d2ba3ebc7 | 2022-03-26T11:23:57Z | Transience in countable MDPs | Conference item | http://purl.org/coar/resource_type/c_5794 | uuid:1fc859e7-78fa-4ab0-92b3-aa3d2ba3ebc7 | English | Symplectic Elements | Schloss Dagstuhl | 2021 | Kiefer, SM | Mayr, R | Shirmohammadi, M | Totzke, P
title	Transience in countable MDPs
