Gittins' theorem under uncertainty
We study dynamic allocation problems for discrete time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under an independence assumption on the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optim...
Main Authors: | Cohen, SN; Treetanthiploet, T |
---|---|
Format: | Journal article |
Language: | English |
Published: | Institute of Mathematical Statistics and Bernoulli Society, 2022 |
_version_ | 1826294388704149504 |
---|---|
author | Cohen, SN; Treetanthiploet, T |
collection | OXFORD |
description | We study dynamic allocation problems for discrete time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under an independence assumption on the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optimal choices. This involves studying the interaction of our uncertainty with controls which determine the filtration. We also run a simple numerical example which illustrates the interaction between the willingness to explore and the uncertainty aversion of the agent when making decisions. |
first_indexed | 2024-03-07T03:44:52Z |
format | Journal article |
id | oxford-uuid:bf200b78-bb8e-410b-924d-fdccf3898a49 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T03:44:52Z |
publishDate | 2022 |
publisher | Institute of Mathematical Statistics and Bernoulli Society |
record_format | dspace |
spelling | oxford-uuid:bf200b78-bb8e-410b-924d-fdccf3898a49; 2022-03-27T05:45:04Z; Gittins' theorem under uncertainty; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:bf200b78-bb8e-410b-924d-fdccf3898a49; English; Symplectic Elements; Institute of Mathematical Statistics and Bernoulli Society; 2022; Cohen, SN; Treetanthiploet, T |
title | Gittins' theorem under uncertainty |