Markov decision processes and discrete-time mean-field games constrained with costly observations


Bibliographic Details
Main Author: Tam, JYY
Other Authors: Reisinger, C
Format: Thesis
Language: English
Published: 2023
Subjects: Stochastic control theory
description <p>In this thesis, we consider Markov decision processes with actively controlled observations. Optimal strategies involve the optimisation of observation times as well as the subsequent action values. We first consider an observation cost model, where the underlying state is observed only at chosen observation times, each at a cost. By including the time elapsed since the last observation as part of the augmented Markov system, the value function satisfies a system of quasi-variational inequalities (QVIs). This class of QVIs can be seen as an extension of the interconnected obstacle problem. We prove a comparison principle for this class of QVIs, which implies uniqueness of solutions to our proposed problem. Penalty methods are then utilised to obtain arbitrarily accurate solutions. Finally, we perform numerical experiments on three applications that illustrate this model.</p> <p>We then consider a model in which agents can exercise control actions that affect their speed of access to information: agents can dynamically decide to receive observations with less delay by paying higher observation costs. Agents seek to exploit their active information gathering by making further decisions that influence their state dynamics so as to maximise rewards. We extend this notion to a corresponding mean-field game (MFG). In the mean-field equilibrium, each generic agent individually solves a partially observed Markov decision problem in which the way partial observations are obtained is itself subject to dynamic control actions by the agent. Based on a finite characterisation of the agents’ belief states, we show how the mean-field game with controlled costly information access can be formulated as an equivalent standard mean-field game on a suitably augmented but finite state space. We prove that, with sufficient entropy regularisation, a fixed-point iteration converges to the unique MFG equilibrium and yields an approximate ε-Nash equilibrium for a large but finite population size. We illustrate the MFG with an example from epidemiology, in which agents can choose medical tests that return results at different speeds and costs.</p>
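The first model can be illustrated with a discrete-time toy example. The thesis works through a system of QVIs solved by penalty methods; the sketch below instead runs plain value iteration on the augmented information state (last observed state, time elapsed since observation) described above. The two-state chain, rewards, observation cost, and elapsed-time truncation are all invented for illustration, not taken from the thesis.

```python
import numpy as np

# Toy sketch (not the thesis's penalty scheme): value iteration for an MDP
# whose state is seen only at chosen observation times, each at a cost c.
# The augmented information state is (s_last, tau): the last observed state
# and the time elapsed since that observation. All parameters are illustrative.

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])      # uncontrolled two-state transition matrix
n_states, n_actions = 2, 2
r = np.eye(2)                   # reward 1 for matching the action to the state
c = 0.15                        # observation cost
gamma = 0.95                    # discount factor
tau_max = 20                    # truncate elapsed time (an approximation)

# Precompute beliefs b[s, tau] = e_s P^tau for each last-observed state.
beliefs = np.zeros((n_states, tau_max + 1, n_states))
for s in range(n_states):
    b = np.eye(n_states)[s]
    for tau in range(tau_max + 1):
        beliefs[s, tau] = b
        b = b @ P

V = np.zeros((n_states, tau_max + 1))
for _ in range(2000):
    V_new = np.empty_like(V)
    for s in range(n_states):
        for tau in range(tau_max + 1):
            b = beliefs[s, tau]
            # Observe: pay c, learn the current state s', act on it;
            # one step later the elapsed time since observing s' is 1.
            q_obs = -c + sum(
                b[sp] * (r[sp].max() + gamma * V[sp, 1])
                for sp in range(n_states))
            # Act blindly on the current belief; elapsed time grows (capped).
            q_blind = max(b @ r[:, a] for a in range(n_actions)) \
                      + gamma * V[s, min(tau + 1, tau_max)]
            V_new[s, tau] = max(q_obs, q_blind)
    if np.abs(V_new - V).max() < 1e-10:
        V = V_new
        break
    V = V_new
```

One sanity check this sketch makes visible: at tau = 0 the state is known exactly, so observing buys no information and costs c, and the optimal strategy there is always to act blindly.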
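The entropy-regularised fixed-point iteration for the mean-field game can likewise be sketched in a generic finite setting: a backward pass computes softmax (entropy-regularised) best-response policies against a population flow, a forward pass computes the flow those policies induce, and the two are iterated with damping. The congestion-style reward, horizon, temperature, and random transitions below are invented placeholders, not the thesis's epidemiology example or its augmented belief-state construction.

```python
import numpy as np

# Generic sketch of an entropy-regularised mean-field-game fixed-point
# iteration on a finite state space. Model ingredients (congestion reward,
# horizon, temperature, random transitions) are illustrative assumptions.

rng = np.random.default_rng(0)
nS, nA, T = 4, 2, 15
lam = 0.5                                        # entropy temperature
P = rng.dirichlet(np.ones(nS), size=(nA, nS))    # P[a, s, :] = next-state law
mu0 = np.full(nS, 1.0 / nS)                      # initial distribution

def reward(mu):
    # Congestion-type reward: occupying a crowded state is penalised.
    return 1.0 - 2.0 * mu                        # shape (nS,), action-independent

def best_response(flow):
    """Backward pass: softmax policies against a fixed population flow."""
    V = np.zeros(nS)
    pol = np.zeros((T, nS, nA))
    for t in reversed(range(T)):
        Q = np.stack([reward(flow[t]) + P[a] @ V for a in range(nA)], axis=1)
        V = lam * np.log(np.exp(Q / lam).sum(axis=1))   # soft (log-sum-exp) value
        pol[t] = np.exp((Q - V[:, None]) / lam)         # softmax policy, rows sum to 1
    return pol

def induced_flow(pol):
    """Forward pass: population flow induced by the given policies."""
    flow = np.zeros((T, nS))
    mu = mu0
    for t in range(T):
        flow[t] = mu
        mu = sum((mu * pol[t, :, a]) @ P[a] for a in range(nA))
    return flow

flow = np.tile(mu0, (T, 1))
for _ in range(200):
    new_flow = induced_flow(best_response(flow))
    if np.abs(new_flow - flow).max() < 1e-9:
        flow = new_flow
        break
    flow = 0.5 * flow + 0.5 * new_flow           # damped fixed-point update
```

The damping step is a standard stabilisation heuristic for such iterations; the contraction argument in the thesis rests on sufficient entropy regularisation, which this toy temperature does not attempt to calibrate.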
id oxford-uuid:ed4ce4fe-a682-4c10-bf4a-5983e69bf090
institution University of Oxford