Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks such as classification and regression. Algorithms based on regression have shown promising empirical success, but theoret...
Main Authors: Foster, Dylan J; Rakhlin, Alexander
Contributors: Statistics and Data Science Center (Massachusetts Institute of Technology)
Format: Article
Language: English
Published / Created: 2021
Online Access: https://hdl.handle.net/1721.1/138306
Similar Items
- Nonstationary Stochastic Bandits: UCB Policies and Minimax Regret
  by: Lai Wei, et al.
  Published / Created: (2024-01-01)
- Comparative Evaluation of Mean Cumulative Regret in Multi-Armed Bandit Algorithms: ETC, UCB, Asymptotically Optimal UCB, and TS
  by: Lei Yicong
  Published / Created: (2025-01-01)
- Top-k eXtreme Contextual Bandits with Arm Hierarchy
  by: Sen, Rajat, et al.
  Published / Created: (2023)
- Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination
  by: Chen, Sitan, et al.
  Published / Created: (2022)
- Contextual bandits with cross-learning
  Published / Created: (2021)