Batched Bandit Problems

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives close to minimax optimal regr...

Bibliographic Details
Main Authors: Perchet, Vianney; Rigollet, Philippe; Chassang, Sylvain; Snowberg, Erik
Other Authors: Massachusetts Institute of Technology. Department of Mathematics
Format: Article
Language: en_US
Published: Institute of Mathematical Statistics 2015
Online Access: http://hdl.handle.net/1721.1/98879
https://orcid.org/0000-0002-0135-7162