Copeland dueling bandits
A version of the dueling bandit problem is addressed in which a Condorcet winner may not exist. Two algorithms are proposed that instead seek to minimize regret with respect to the Copeland winner, which, unlike the Condorcet winner, is guaranteed to exist. The first, Copeland Confidence Bound (CCB)...
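The distinction between the two notions of winner can be illustrated with a minimal sketch. It assumes a hypothetical preference matrix `pref`, where `pref[i][j]` is the probability that arm `i` beats arm `j` in a duel; the function names and the example matrix are illustrative and do not come from the paper. The sketch only computes winners from known preferences and is not an implementation of CCB or of the second algorithm.

```python
from typing import List, Optional


def copeland_scores(pref: List[List[float]]) -> List[int]:
    """For each arm, count the other arms it beats with probability > 0.5."""
    k = len(pref)
    return [
        sum(1 for j in range(k) if j != i and pref[i][j] > 0.5)
        for i in range(k)
    ]


def copeland_winners(pref: List[List[float]]) -> List[int]:
    """Arms with the maximal Copeland score; at least one always exists."""
    scores = copeland_scores(pref)
    best = max(scores)
    return [i for i, s in enumerate(scores) if s == best]


def condorcet_winner(pref: List[List[float]]) -> Optional[int]:
    """The arm that beats every other arm, or None if there is no such arm."""
    k = len(pref)
    for i, s in enumerate(copeland_scores(pref)):
        if s == k - 1:
            return i
    return None


# Hypothetical preferences: arm 0 loses to arm 2, so no arm beats all others.
cyclic = [
    [0.5, 0.6, 0.4, 0.7],
    [0.4, 0.5, 0.6, 0.7],
    [0.6, 0.4, 0.5, 0.3],
    [0.3, 0.3, 0.7, 0.5],
]

print(copeland_scores(cyclic))   # [2, 2, 1, 1]
print(copeland_winners(cyclic))  # [0, 1]
print(condorcet_winner(cyclic))  # None
```

In this example no Condorcet winner exists, yet arms 0 and 1 tie as Copeland winners, which is why regret can still be defined with respect to a Copeland winner.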
Main authors: | Zoghi, M, Karnin, Z, Whiteson, S, Rijke, M |
---|---|
Format: | Conference item |
Published: | 2015 |
Similar items
- Melancholic Mem in the Third Life of Grange Copeland
  Authors: Sedehi, Kamelia Talebian, et al.
  Published: (2015)
- Good Outcome Following Copeland Hemiarthroplasty for Acromegalic Arthropathy
  Authors: S. E. Johnson-Lynn, et al.
  Published: (2011-01-01)
- Synergy in science: an interview with Neal Copeland and Nancy Jenkins
  Published: (2012-11-01)
- Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations
  Authors: de Freitas, N, et al.
  Published: (2012)
- StreamingBandit: Experimenting with Bandit Policies
  Authors: Jules Kruijswijk, et al.
  Published: (2020-08-01)