Collaboratively Learning the Best Option on Graphs, Using Bounded Local Memory
We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $t \to \infty$) it pulls only the arm with the highest expected rew...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | Association for Computing Machinery (ACM), 2021 |
Online Access: | https://hdl.handle.net/1721.1/135402 |