From External to Swap Regret 2.0: An Efficient Reduction for Large Action Spaces

STOC ’24, June 24–28, 2024, Vancouver, BC, Canada

Bibliographic Details
Main Authors: Dagan, Yuval; Daskalakis, Constantinos; Fishelson, Maxwell; Golowich, Noah
Format: Article (Conference Paper)
Language: English
Published: ACM (Association for Computing Machinery), 2024
Institution: Massachusetts Institute of Technology
Online Access: https://hdl.handle.net/1721.1/155581
DOI: 10.1145/3618260.3649681
ISBN: 979-8-4007-0383-6
License: Creative Commons Attribution 4.0 (https://creativecommons.org/licenses/by/4.0/); © The author(s)
Citation: Dagan, Yuval, Daskalakis, Constantinos, Fishelson, Maxwell and Golowich, Noah. 2024. "From External to Swap Regret 2.0: An Efficient Reduction for Large Action Spaces."

Description:
We provide a novel reduction from swap-regret minimization to external-regret minimization, which improves upon the classical reductions of Blum-Mansour and Stoltz-Lugosi in that it does not require finiteness of the action space. We show that, whenever there exists a no-external-regret algorithm for some hypothesis class, there must also exist a no-swap-regret algorithm for that same class. For the problem of learning with expert advice, our result implies that it is possible to guarantee that the swap regret is bounded by ε after (log N)^Õ(1/ε) rounds and with O(N) per-iteration complexity, where N is the number of experts, while the classical reductions of Blum-Mansour and Stoltz-Lugosi require at least Ω(N/ε²) rounds and at least Ω(N³) total computational cost. Our result comes with an associated lower bound, which, in contrast to that of Blum-Mansour, holds for oblivious and ℓ1-constrained adversaries and for learners that can employ distributions over experts; it shows that the number of rounds must be Ω(N/ε²) or exponential in 1/ε.

Our reduction implies that, if no-regret learning is possible in some game, then this game must have approximate correlated equilibria of arbitrarily good approximation. This strengthens the folklore implication of no-regret learning that approximate coarse correlated equilibria exist. Importantly, it provides a sufficient condition for the existence of approximate correlated equilibria which vastly extends the requirement that the action set be finite, or that it be compact with continuous utility functions, allowing for games with finite Littlestone or finite sequential fat-shattering dimension, thus answering a question left open in "Fast rates for nonparametric online learning: from realizability to learning in games" and "Online learning and solving infinite games with an ERM oracle".

Moreover, it answers several outstanding questions about equilibrium computation and/or learning in games. In particular, for constant values of ε:

(a) we show that ε-approximate correlated equilibria in extensive-form games can be computed efficiently, advancing a long-standing open problem; see, e.g., "Extensive-form correlated equilibrium: Definition and computational complexity" and "Polynomial-Time Linear-Swap Regret Minimization in Imperfect-Information Sequential Games";

(b) we show that the query and communication complexities of computing ε-approximate correlated equilibria in N-action normal-form games are N · poly log(N) and poly log(N) respectively, advancing an open problem of "Informational Bounds on Equilibria";

(c) we show that ε-approximate correlated equilibria of sparsity poly log(N) can be computed efficiently, advancing an open problem of "Simple Approximate Equilibria in Large Games";

(d) finally, we show that in the adversarial bandit setting, sublinear swap regret can be achieved in only Õ(N) rounds, advancing an open problem of "From External to Internal Regret" and "Tight Lower Bound and Efficient Reduction for Swap Regret".
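
For orientation, the two regret notions compared above can be stated as follows. This is the standard formulation, with notation (p^t for the learner's distribution over the N experts at round t, ℓ^t for the loss vector, φ for a swap function) chosen here for illustration rather than taken from the paper:

    % External regret: compare to the best single expert in hindsight.
    \mathrm{ExtReg}(T) = \sum_{t=1}^{T} \langle p^t, \ell^t \rangle - \min_{j \in [N]} \sum_{t=1}^{T} \ell^t_j

    % Swap regret: compare to the best per-expert rewiring \phi : [N] \to [N].
    % Restricting \phi to constant maps recovers external regret, so swap
    % regret is always at least as large; driving it to zero is what yields
    % correlated (rather than merely coarse correlated) equilibria.
    \mathrm{SwapReg}(T) = \max_{\phi : [N] \to [N]} \sum_{t=1}^{T} \sum_{i=1}^{N} p^t_i \bigl( \ell^t_i - \ell^t_{\phi(i)} \bigr)

    % Rounds needed for average swap regret at most \varepsilon:
    %   this paper's reduction:   (\log N)^{\tilde{O}(1/\varepsilon)}
    %   classical reductions:     \Omega(N / \varepsilon^2)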
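To make the classical baseline concrete, below is a minimal sketch of the Blum-Mansour reduction that the abstract benchmarks against (not the paper's new reduction), assuming multiplicative weights as the external-regret subroutine; all identifiers are illustrative. The reduction maintains one external-regret learner per expert, plays the stationary distribution of the matrix of their recommendations, and charges each learner the true loss scaled by the probability its expert was played. The per-round fixed-point solve and the N-by-N state are the source of the Ω(N³) total cost cited above.

    import numpy as np

    def stationary_distribution(Q, iters=10_000, tol=1e-12):
        # Power iteration for the fixed point p = p Q of a row-stochastic
        # matrix. With multiplicative weights every entry of Q is strictly
        # positive, so the fixed point is unique and iteration converges.
        p = np.full(Q.shape[0], 1.0 / Q.shape[0])
        for _ in range(iters):
            p_next = p @ Q
            if np.linalg.norm(p_next - p, 1) < tol:
                p = p_next
                break
            p = p_next
        return p / p.sum()

    class BlumMansourSwap:
        # N copies of multiplicative weights, one per expert: the classical
        # swap-to-external reduction, kept deliberately simple.
        def __init__(self, n, eta=0.1):
            self.n, self.eta = n, eta
            self.W = np.ones((n, n))   # W[i] = weight vector of copy i

        def play(self):
            Q = self.W / self.W.sum(axis=1, keepdims=True)  # row i: copy i's distribution
            self.p = stationary_distribution(Q)             # play p solving p = p Q
            return self.p

        def update(self, loss):
            # Copy i is charged the true loss scaled by p_i, the probability
            # with which its recommendation was followed this round.
            self.W *= np.exp(-self.eta * np.outer(self.p, loss))

A round trip looks like the following (play before update, since the loss split depends on the distribution just played):

    rng = np.random.default_rng(0)
    learner = BlumMansourSwap(n=5)
    for _ in range(1000):
        p = learner.play()               # distribution over the 5 experts
        learner.update(rng.random(5))    # stand-in for adversarial losses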