Contextual bandits with cross-learning
© 2019 Neural Information Processing Systems Foundation. All rights reserved. In the classical contextual bandits problem, in each round t, a learner observes some context c, chooses some action a to perform, and receives some reward r_{a,t}(c). We consider the variant of this problem where in addition...
Format: | Article |
---|---|
Language: | English |
Published: | 2021 |
Online Access: | https://hdl.handle.net/1721.1/137415 |
Similar Items
- Contextual bandits with cross-learning
  by: Balseiro, Santiago, et al.
  Published: (2021)
- Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
  by: Foster, Dylan J, et al.
  Published: (2021)
- Top-k eXtreme Contextual Bandits with Arm Hierarchy
  by: Sen, Rajat, et al.
  Published: (2023)
- Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination
  by: Chen, Sitan, et al.
  Published: (2022)
- Undiscounted bandit games
  by: Keller, G, et al.
  Published: (2020)