The statistical complexity of early-stopped mirror descent

Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the...

Full description

Bibliographic Details
Main Authors:	Vaškevičius, T, Kanade, V, Rebeschini, P
Format:	Journal article
Language:	English
Published:	Neural Information Processing Systems Foundation, Inc. 2020

_version_	1826272666154172416
author	Vaškevičius, T Kanade, V Rebeschini, P
author_facet	Vaškevičius, T Kanade, V Rebeschini, P
author_sort	Vaškevičius, T
collection	OXFORD
description	Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the unregularized empirical risk with the squared loss for linear models and kernel methods. By completing an inequality that characterizes convexity for the squared loss, we identify an intrinsic link between offset Rademacher complexities and potential-based convergence analysis of mirror descent methods. Our observation immediately yields excess risk guarantees for the path traced by the iterates of mirror descent in terms of offset complexities of certain function classes depending only on the choice of the mirror map, initialization point, step-size, and the number of iterations. We apply our theory to recover, in a rather clean and elegant manner via rather short proofs, some of the recent results in the implicit regularization literature, while also showing how to improve upon them in some settings.
first_indexed	2024-03-06T22:16:11Z
format	Journal article
id	oxford-uuid:5375520d-881a-4e5f-bd18-016874a6c86e
institution	University of Oxford
language	English
last_indexed	2024-03-06T22:16:11Z
publishDate	2020
publisher	Neural Information Processing Systems Foundation, Inc.
record_format	dspace
spelling	oxford-uuid:5375520d-881a-4e5f-bd18-016874a6c86e2022-03-26T16:31:49ZThe statistical complexity of early-stopped mirror descentJournal articlehttp://purl.org/coar/resource_type/c_dcae04bcuuid:5375520d-881a-4e5f-bd18-016874a6c86eEnglishSymplectic ElementsNeural Information Processing Systems Foundation, Inc.2020Vaškevičius, TKanade, VRebeschini, PRecently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the unregularized empirical risk with the squared loss for linear models and kernel methods. By completing an inequality that characterizes convexity for the squared loss, we identify an intrinsic link between offset Rademacher complexities and potential-based convergence analysis of mirror descent methods. Our observation immediately yields excess risk guarantees for the path traced by the iterates of mirror descent in terms of offset complexities of certain function classes depending only on the choice of the mirror map, initialization point, step-size, and the number of iterations. We apply our theory to recover, in a rather clean and elegant manner via rather short proofs, some of the recent results in the implicit regularization literature, while also showing how to improve upon them in some settings.
spellingShingle	Vaškevičius, T Kanade, V Rebeschini, P The statistical complexity of early-stopped mirror descent
title	The statistical complexity of early-stopped mirror descent
title_full	The statistical complexity of early-stopped mirror descent
title_fullStr	The statistical complexity of early-stopped mirror descent
title_full_unstemmed	The statistical complexity of early-stopped mirror descent
title_short	The statistical complexity of early-stopped mirror descent
title_sort	statistical complexity of early stopped mirror descent
work_keys_str_mv	AT vaskeviciust thestatisticalcomplexityofearlystoppedmirrordescent AT kanadev thestatisticalcomplexityofearlystoppedmirrordescent AT rebeschinip thestatisticalcomplexityofearlystoppedmirrordescent AT vaskeviciust statisticalcomplexityofearlystoppedmirrordescent AT kanadev statisticalcomplexityofearlystoppedmirrordescent AT rebeschinip statisticalcomplexityofearlystoppedmirrordescent

The statistical complexity of early-stopped mirror descent

Similar Items