From External to Swap Regret 2.0: An Efficient Reduction for Large Action Spaces

STOC ’24, June 24–28, 2024, Vancouver, BC, Canada

Bibliographic Details
Main Authors: Dagan, Yuval; Daskalakis, Constantinos; Fishelson, Maxwell; Golowich, Noah
Format: Article (Conference Paper)
Language: English
Published: ACM (Association for Computing Machinery), 2024
Institution: Massachusetts Institute of Technology
Online Access: https://hdl.handle.net/1721.1/155581
DOI: 10.1145/3618260.3649681
ISBN: 979-8-4007-0383-6
License: Creative Commons Attribution 4.0 (https://creativecommons.org/licenses/by/4.0/); © The author(s)
Citation: Dagan, Yuval, Daskalakis, Constantinos, Fishelson, Maxwell and Golowich, Noah. 2024. "From External to Swap Regret 2.0: An Efficient Reduction for Large Action Spaces."

Description:
We provide a novel reduction from swap-regret minimization to external-regret minimization, which improves upon the classical reductions of Blum-Mansour and Stoltz-Lugosi in that it does not require finiteness of the action space. We show that, whenever there exists a no-external-regret algorithm for some hypothesis class, there must also exist a no-swap-regret algorithm for that same class. For the problem of learning with expert advice, our result implies that it is possible to guarantee that the swap regret is bounded by ε after (log N)^Õ(1/ε) rounds and with O(N) per-iteration complexity, where N is the number of experts, while the classical reductions of Blum-Mansour and Stoltz-Lugosi require at least Ω(N/ε²) rounds and at least Ω(N³) total computational cost. Our result comes with an associated lower bound, which, in contrast to that of Blum-Mansour, holds for oblivious and ℓ1-constrained adversaries and for learners that can employ distributions over experts; it shows that the number of rounds must be Ω(N/ε²) or exponential in 1/ε.

Our reduction implies that, if no-regret learning is possible in some game, then this game must have approximate correlated equilibria of arbitrarily good approximation. This strengthens the folklore implication of no-regret learning that approximate coarse correlated equilibria exist. Importantly, it provides a sufficient condition for the existence of approximate correlated equilibria which vastly extends the requirement that the action set be finite, or that it be compact with continuous utility functions, allowing for games with finite Littlestone or finite sequential fat-shattering dimension, thus answering a question left open in "Fast rates for nonparametric online learning: from realizability to learning in games" and "Online learning and solving infinite games with an ERM oracle".

Moreover, it answers several outstanding questions about equilibrium computation and/or learning in games. In particular, for constant values of ε:

(a) we show that ε-approximate correlated equilibria in extensive-form games can be computed efficiently, advancing a long-standing open problem; see, e.g., "Extensive-form correlated equilibrium: Definition and computational complexity" and "Polynomial-Time Linear-Swap Regret Minimization in Imperfect-Information Sequential Games";

(b) we show that the query and communication complexities of computing ε-approximate correlated equilibria in N-action normal-form games are N · poly log(N) and poly log(N) respectively, advancing an open problem of "Informational Bounds on Equilibria";

(c) we show that ε-approximate correlated equilibria of sparsity poly log(N) can be computed efficiently, advancing an open problem of "Simple Approximate Equilibria in Large Games";

(d) finally, we show that in the adversarial bandit setting, sublinear swap regret can be achieved in only Õ(N) rounds, advancing an open problem of "From External to Internal Regret" and "Tight Lower Bound and Efficient Reduction for Swap Regret".
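
For orientation, the two regret notions compared above can be stated as follows. This is the standard formulation, with notation (p^t for the learner's distribution over the N experts at round t, ℓ^t for the loss vector, φ for a swap function) chosen here for illustration rather than taken from the paper:

    % External regret: compare to the best single expert in hindsight.
    \mathrm{ExtReg}(T) = \sum_{t=1}^{T} \langle p^t, \ell^t \rangle - \min_{j \in [N]} \sum_{t=1}^{T} \ell^t_j

    % Swap regret: compare to the best per-expert rewiring \phi : [N] \to [N].
    % Restricting \phi to constant maps recovers external regret, so swap
    % regret is always at least as large; driving it to zero is what yields
    % correlated (rather than merely coarse correlated) equilibria.
    \mathrm{SwapReg}(T) = \max_{\phi : [N] \to [N]} \sum_{t=1}^{T} \sum_{i=1}^{N} p^t_i \bigl( \ell^t_i - \ell^t_{\phi(i)} \bigr)

    % Rounds needed for average swap regret at most \varepsilon:
    %   this paper's reduction:   (\log N)^{\tilde{O}(1/\varepsilon)}
    %   classical reductions:     \Omega(N / \varepsilon^2)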
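To make the classical baseline concrete, below is a minimal sketch of the Blum-Mansour reduction that the abstract benchmarks against (not the paper's new reduction), assuming multiplicative weights as the external-regret subroutine; all identifiers are illustrative. The reduction maintains one external-regret learner per expert, plays the stationary distribution of the matrix of their recommendations, and charges each learner the true loss scaled by the probability its expert was played. The per-round fixed-point solve and the N-by-N state are the source of the Ω(N³) total cost cited above.

    import numpy as np

    def stationary_distribution(Q, iters=10_000, tol=1e-12):
        # Power iteration for the fixed point p = p Q of a row-stochastic
        # matrix. With multiplicative weights every entry of Q is strictly
        # positive, so the fixed point is unique and iteration converges.
        p = np.full(Q.shape[0], 1.0 / Q.shape[0])
        for _ in range(iters):
            p_next = p @ Q
            if np.linalg.norm(p_next - p, 1) < tol:
                p = p_next
                break
            p = p_next
        return p / p.sum()

    class BlumMansourSwap:
        # N copies of multiplicative weights, one per expert: the classical
        # swap-to-external reduction, kept deliberately simple.
        def __init__(self, n, eta=0.1):
            self.n, self.eta = n, eta
            self.W = np.ones((n, n))   # W[i] = weight vector of copy i

        def play(self):
            Q = self.W / self.W.sum(axis=1, keepdims=True)  # row i: copy i's distribution
            self.p = stationary_distribution(Q)             # play p solving p = p Q
            return self.p

        def update(self, loss):
            # Copy i is charged the true loss scaled by p_i, the probability
            # with which its recommendation was followed this round.
            self.W *= np.exp(-self.eta * np.outer(self.p, loss))

A round trip looks like the following (play before update, since the loss split depends on the distribution just played):

    rng = np.random.default_rng(0)
    learner = BlumMansourSwap(n=5)
    for _ in range(1000):
        p = learner.play()               # distribution over the 5 experts
        learner.update(rng.random(5))    # stand-in for adversarial losses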