Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks such as classification and regression. Algorithms based on regression have shown promising empirical success, but theoret...
Main Authors: | Foster, Dylan J, Rakhlin, Alexander |
---|---|
Other Authors: | Statistics and Data Science Center (Massachusetts Institute of Technology) |
Format: | Article |
Language: | English |
Published: | 2021 |
Online Access: | https://hdl.handle.net/1721.1/138306 |
Similar Items
- Nonstationary Stochastic Bandits: UCB Policies and Minimax Regret
  by: Lai Wei, et al.
  Published: (2024-01-01)
- Top-k eXtreme Contextual Bandits with Arm Hierarchy
  by: Sen, Rajat, et al.
  Published: (2023)
- Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination
  by: Chen, Sitan, et al.
  Published: (2022)
- Contextual bandits with cross-learning
  by: Balseiro, Santiago, et al.
  Published: (2021)