Gittins' theorem under uncertainty

We study dynamic allocation problems for discrete-time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under an independence assumption on the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optim...

Detailed bibliography
Main authors: Cohen, SN; Treetanthiploet, T
Format: Journal article
Language: English
Published: Institute of Mathematical Statistics and Bernoulli Society, 2022
author Cohen, SN
Treetanthiploet, T
collection OXFORD
description We study dynamic allocation problems for discrete-time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under an independence assumption on the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optimal choices. This involves studying the interaction of our uncertainty with controls which determine the filtration. We also run a simple numerical example which illustrates the interaction between the agent's willingness to explore and uncertainty aversion when making decisions.
first_indexed 2024-03-07T03:44:52Z
format Journal article
id oxford-uuid:bf200b78-bb8e-410b-924d-fdccf3898a49
institution University of Oxford
language English
last_indexed 2024-03-07T03:44:52Z
publishDate 2022
publisher Institute of Mathematical Statistics and Bernoulli Society
record_format dspace
title Gittins' theorem under uncertainty