Collaboratively Learning the Best Option on Graphs, Using Bounded Local Memory
We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $t \to \infty$) it pulls only the arm with the highest expected rew...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | Association for Computing Machinery (ACM), 2021 |
Online Access: | https://hdl.handle.net/1721.1/135402 |