Batched Bandit Problems
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives close to minimax optimal regr...
Main Authors: | Perchet, Vianney, Rigollet, Philippe, Chassang, Sylvain, Snowberg, Erik |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Mathematics |
Format: | Article |
Language: | en_US |
Published: | Institute of Mathematical Statistics, 2015 |
Online Access: | http://hdl.handle.net/1721.1/98879 https://orcid.org/0000-0002-0135-7162 |
Similar Items
- Online learning in repeated auctions
  by: Weed, Jonathan, et al.
  Published: (2022)
- An Algorithmic Solution to the Blotto Game using Multi-marginal Couplings
  by: Perchet, Vianney, et al.
  Published: (2022)
- A Theory of Experimenters: Robustness, Randomization, and Balance
  by: Banerjee, Abhijit V., et al.
  Published: (2022)
- A Theory of Experimenters: Robustness, Randomization, and Balance
  by: Banerjee, Abhijit V, et al.
  Published: (2021)
- Bandit Problems under Censored Feedback
  by: Guinet, Gauthier Marc Benoit
  Published: (2023)