Contextual bandits with cross-learning

© 2019 Neural Information Processing Systems Foundation. All rights reserved. In the classical contextual bandits problem, in each round t, a learner observes some context c, chooses some action a to perform, and receives some reward r_{a,t}(c). We consider the variant of this problem where in addition...


Bibliographic Details
Format: Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/1721.1/137415

Similar Items