The statistical complexity of early-stopped mirror descent

Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the unregularized empirical risk.

Bibliographic Details
Main Authors:	Kanade, V; Rebeschini, P; Vaškevičius, T
Format:	Journal article
Language:	English
Published:	Oxford University Press 2023

collection	OXFORD
description	Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the unregularized empirical risk. We consider the set-up of learning linear models and kernel methods for strongly convex and Lipschitz loss functions while imposing only boundedness conditions on the unknown data-generating mechanism. By completing an inequality that characterizes convexity for the squared loss, we identify an intrinsic link between offset Rademacher complexities and potential-based convergence analysis of mirror descent methods. Our observation immediately yields excess risk guarantees for the path traced by the iterates of mirror descent in terms of offset complexities of certain function classes depending only on the choice of the mirror map, initialization point, step size and the number of iterations. We apply our theory to recover, in a clean and elegant manner via rather short proofs, some of the recent results in the implicit regularization literature while also showing how to improve upon them in some settings.
first_indexed	2024-03-07T08:17:22Z
id	oxford-uuid:90240064-bf6a-405f-9a4c-f65e4220c038
institution	University of Oxford
last_indexed	2024-03-07T08:17:22Z
record_format	dspace
