Optimistic Gittins indices

Starting with the Thompson sampling algorithm, recent years have seen a resurgence of interest in Bayesian algorithms for the Multi-armed Bandit (MAB) problem. These algorithms seek to exploit prior information on arm biases, and while several have been shown to be regret optimal, their design has no...

Bibliographic Details
Main Authors: Gutin, Eli; Farias, Vivek F.
Other Authors: Sloan School of Management
Format: Article
Published: NIPS Foundation 2020
Online Access: https://hdl.handle.net/1721.1/128464