Markov decision processes and discrete-time mean-field games constrained with costly observations
Main Author: Tam, JYY
Other Authors: Reisinger, C
Format: Thesis
Language: English
Institution: University of Oxford
Published: 2023
Subjects: Stochastic control theory
In this thesis, we consider Markov decision processes with actively controlled observations. Optimal strategies involve the optimisation of observation times as well as the subsequent action values. We first consider an observation cost model, where the underlying state is observed only at chosen observation times, at a cost. By including the time elapsed since the last observation as part of the augmented Markov system, the value function satisfies a system of quasi-variational inequalities (QVIs). This class of QVIs can be seen as an extension of the interconnected obstacle problem. We prove a comparison principle for this class of QVIs, which implies uniqueness of solutions to our proposed problem. Penalty methods are then utilised to obtain arbitrarily accurate solutions. Finally, we perform numerical experiments on three applications that illustrate this model.
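As a rough illustration of the kind of system described above (the notation is schematic, not the thesis's exact formulation), an interconnected QVI couples the value functions of finitely many "regimes" — here indexed by the discretised time since the last observation — through an intervention operator that charges the observation cost:

```latex
% Schematic interconnected QVI system (illustrative notation only):
%   v_i  : value function in regime i (e.g. time elapsed since last observation)
%   L_i  : generator of the state dynamics, f_i : running reward, rho : discount
%   M_i  : intervention operator -- pay cost c_{ij} and switch to regime j
\[
  \min\!\Big\{ \rho\, v_i(x) - (\mathcal{L}_i v_i)(x) - f_i(x),\;
               v_i(x) - (\mathcal{M}_i v)(x) \Big\} = 0,
  \qquad
  (\mathcal{M}_i v)(x) = \sup_{j \neq i} \big\{ v_j(x) - c_{ij} \big\}.
\]
% A standard penalty approximation replaces the obstacle constraint by a
% penalty term with parameter eps -> 0:
\[
  \rho\, v_i^{\varepsilon}(x) - (\mathcal{L}_i v_i^{\varepsilon})(x) - f_i(x)
  - \tfrac{1}{\varepsilon}\,
    \big( (\mathcal{M}_i v^{\varepsilon})(x) - v_i^{\varepsilon}(x) \big)^{+} = 0 .
\]
```

The penalised equation is unconstrained and smooth enough for standard iterative solvers; as the penalty parameter shrinks, its solution approaches the QVI solution, which is the sense in which penalty methods yield arbitrarily accurate approximations.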
We then consider a model where agents can exercise control actions that affect their speed of access to information. The agents can dynamically decide to receive observations with less delay by paying higher observation costs. Agents seek to exploit their active information gathering by making further decisions to influence their state dynamics to maximise rewards. We also extend this notion to a corresponding mean-field game (MFG). In the mean-field equilibrium, each generic agent individually solves a partially observed Markov decision problem in which the way partial observations are obtained is itself also subject to dynamic control actions by the agent. Based on a finite characterisation of the agents' belief states, we show how the mean-field game with controlled costly information access can be formulated as an equivalent standard mean-field game on a suitably augmented but finite state space. We prove that with sufficient entropy regularisation, a fixed-point iteration converges to the unique MFG equilibrium and yields an approximate ε-Nash equilibrium for a large but finite population size. We illustrate our MFG by an example from epidemiology, where agents can choose medical tests with different speeds and costs.
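The fixed-point iteration with entropy regularisation can be sketched on a small synthetic finite-state game. Everything below — the transition kernels, the congestion-type reward coupling, and the helper names `best_response` and `induced_flow` — is invented for illustration; it is not the thesis's belief-state model or its epidemiology example, only a minimal instance of the same iteration scheme: compute the soft (entropy-regularised) best response to a mean-field flow, propagate the induced population flow forward, and damp the update until a fixed point is reached.

```python
import numpy as np

# Synthetic finite MFG: S states, A actions, horizon T, entropy weight tau.
S, A, T, tau, damp = 4, 3, 12, 1.0, 0.5
rng = np.random.default_rng(0)
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)        # P[a, s, s'] transition kernels
base = rng.random((S, A))                # state/action base rewards
m0 = np.full(S, 1.0 / S)                 # initial population distribution

def reward(m):
    # Congestion coupling: being in a crowded state is penalised.
    return base - 0.5 * m[:, None]

def best_response(m_flow):
    """Soft-Bellman backward pass: entropy-regularised optimal policies."""
    V = np.zeros(S)
    pis = np.zeros((T, S, A))
    for t in reversed(range(T)):
        Q = reward(m_flow[t]) + np.einsum('ast,t->sa', P, V)
        V = tau * np.log(np.exp(Q / tau).sum(axis=1))   # soft value
        pis[t] = np.exp((Q - V[:, None]) / tau)          # softmax policy
    return pis

def induced_flow(pis):
    """Forward pass: population flow induced by the policies."""
    m_flow = np.zeros((T, S))
    m = m0.copy()
    for t in range(T):
        m_flow[t] = m
        m = np.einsum('s,sa,ast->t', m, pis[t], P)
    return m_flow

# Damped fixed-point iteration on the mean-field flow.
m_flow = np.tile(m0, (T, 1))
for it in range(300):
    new = induced_flow(best_response(m_flow))
    gap = float(np.abs(new - m_flow).max())
    m_flow = (1 - damp) * m_flow + damp * new
    if gap < 1e-9:
        break
```

With enough entropy regularisation (large `tau` relative to the mean-field coupling) the best-response map is smoothed and the damped iteration settles to a single flow, mirroring the uniqueness-and-convergence result stated in the abstract; with `tau` too small the undamped map can cycle, which is why the damping factor is included.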