Decentralized cooperative stochastic bandits

We study a decentralized cooperative stochastic multi-armed bandit problem with K arms on a network of N agents. In our model, the reward distribution of each arm is the same for each agent and rewards are drawn independently across agents and time steps. In each round, each agent chooses an arm to...

Bibliographic Details
Main Authors: Martínez-Rubio, D; Kanade, V; Rebeschini, P
Format: Conference item
Language: English
Published: Neural Information Processing Systems Foundation, 2019