Copeland dueling bandits
A version of the dueling bandit problem is addressed in which a Condorcet winner may not exist. Two algorithms are proposed that instead seek to minimize regret with respect to the Copeland winner, which, unlike the Condorcet winner, is guaranteed to exist. The first, Copeland Confidence Bound (CCB)...
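To illustrate the distinction the abstract draws, here is a minimal sketch, not taken from the paper, of how a Condorcet winner and a Copeland winner could be read off a known preference matrix. The function names and the 4-arm matrix are hypothetical. Entry `prefs[i][j]` is assumed to be the probability that arm `i` beats arm `j`. The sketch covers only the definitions, not the CCB algorithm or any regret computation.

```python
from typing import List, Optional


def copeland_scores(prefs: List[List[float]]) -> List[int]:
    """For each arm, count the other arms it beats with probability > 0.5."""
    k = len(prefs)
    return [
        sum(1 for j in range(k) if j != i and prefs[i][j] > 0.5)
        for i in range(k)
    ]


def condorcet_winner(prefs: List[List[float]]) -> Optional[int]:
    """Return the arm that beats every other arm, or None if there is none."""
    k = len(prefs)
    for arm, score in enumerate(copeland_scores(prefs)):
        if score == k - 1:
            return arm
    return None


def copeland_winners(prefs: List[List[float]]) -> List[int]:
    """Return all arms with the maximal Copeland score (never empty)."""
    scores = copeland_scores(prefs)
    best = max(scores)
    return [arm for arm, score in enumerate(scores) if score == best]


# Hypothetical preference matrix with a cycle among arms 0, 1 and 2:
# 0 beats 1, 1 beats 2, 2 beats 0, so no arm beats all the others.
prefs = [
    [0.5, 0.6, 0.4, 0.7],
    [0.4, 0.5, 0.6, 0.7],
    [0.6, 0.4, 0.5, 0.3],
    [0.3, 0.3, 0.7, 0.5],
]

print(copeland_scores(prefs))   # [2, 2, 1, 1]
print(condorcet_winner(prefs))  # None
print(copeland_winners(prefs))  # [0, 1]
```

In this example no Condorcet winner exists, yet the set of Copeland winners is non-empty, and it may contain more than one arm.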
Main Authors: | Zoghi, M, Karnin, Z, Whiteson, S, Rijke, M |
---|---|
Format: | Conference item |
Published: | 2015 |
Similar Items
- Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations
  by: de Freitas, N, et al.
  Published: (2012)
- Melancholic Mem in the Third Life of Grange Copeland
  by: Sedehi, Kamelia Talebian, et al.
  Published: (2015)
- OxIS 2019: Dueling perspectives on the internet in Britain
  by: Blank, G, et al.
  Published: (2019)
- Matching with semi-bandits
  by: Kasy, M, et al.
  Published: (2022)
- Batched Bandit Problems
  by: Perchet, Vianney, et al.
  Published: (2015)