Output-weighted sampling for multi-armed bandits with extreme payoffs
We present a new type of acquisition function for online decision-making in multi-armed and contextual bandit problems with extreme payoffs. Specifically, we model the payoff function as a Gaussian process and formulate a novel type of upper confidence bound acquisition function that guides explorat...
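The abstract's starting point, a Gaussian process payoff model paired with an upper confidence bound rule, can be sketched in its standard form. The sketch below is plain GP-UCB over a discrete set of arms, not the output-weighted acquisition function the article proposes. The RBF kernel, the unit prior variance, and the names `rbf_kernel`, `gp_posterior`, `ucb_select`, `length_scale`, `noise`, and `beta` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel with unit prior variance (an assumed choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    # Posterior mean and standard deviation of a zero-mean GP at x_query.
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb_select(x_obs, y_obs, arms, beta=2.0):
    # Standard UCB rule: pick the arm maximizing mean + beta * std.
    mean, std = gp_posterior(x_obs, y_obs, arms)
    return int(np.argmax(mean + beta * std))
```

Calling `ucb_select(x_obs, y_obs, arms)` with the payoffs observed so far returns the index of the next arm to pull. A larger `beta` favors arms whose payoff is still uncertain, while a smaller one favors arms with a high predicted payoff.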
Main Authors: | Yang, Yibo; Blanchard, Antoine; Sapsis, Themistoklis; Perdikaris, Paris |
---|---|
Format: | Article |
Language: | English |
Published: | The Royal Society, 2024 |
Online Access: | https://hdl.handle.net/1721.1/154219 |
Similar Items
- Optimal criteria and their asymptotic form for data selection in data-driven reduced-order modelling with Gaussian process regression
  by: Sapsis, Themistoklis P., et al.
  Published: (2024)
- Bayesian optimization with output-weighted optimal sampling
  by: Blanchard, Antoine, et al.
  Published: (2022)
- Output-Weighted Optimal Sampling for Bayesian Experimental Design and Uncertainty Quantification
  by: Blanchard, Antoine, et al.
  Published: (2022)
- Representation theoretic interpretation and interpolation properties of inhomogeneous spin q-Whittaker polynomials
  by: Korotkikh, Sergei
  Published: (2024)
- Bayesian optimization with output-weighted optimal sampling
  by: Blanchard, Antoine Bertrand Emile, et al.
  Published: (2022)