Dual space preconditioning for gradient descent
The conditions of relative smoothness and relative strong convexity were recently introduced for the analysis of Bregman gradient methods for convex optimization. We introduce a generalized left-preconditioning method for gradient descent and show that its convergence on an essentially smooth convex objective function can be guaranteed via an application of relative smoothness in the dual space. Our relative smoothness assumption is between the designed preconditioner and the convex conjugate of the objective, and it generalizes the typical Lipschitz gradient assumption. Under dual relative strong convexity, we obtain linear convergence with a generalized condition number that is invariant under horizontal translations, distinguishing it from Bregman gradient methods. Thus, in principle our method is capable of improving the conditioning of gradient descent on problems with a non-Lipschitz gradient or nonstrongly convex structure. We demonstrate our method on $p$-norm regression and exponential penalty function minimization.
Main Authors: | Maddison, CJ; Paulin, D; Teh, YW; Doucet, A |
---|---|
Format: | Journal article |
Language: | English |
Published: | Society for Industrial and Applied Mathematics, 2021 |
Institution: | University of Oxford |
ID: | oxford-uuid:d8afa7a6-7e95-4e67-834f-16a643ca01c4 |
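For concreteness, here is a minimal sketch of the left-preconditioned iteration described in the abstract, $x_{k+1} = x_k - \frac{1}{L}\nabla h(\nabla f(x_k))$, applied to a toy $p$-norm regression instance. The power-type preconditioner $h(y) = \frac{1}{q}\|y\|_q^q$ (with $q$ the Hölder conjugate of $p$), the step size, and the problem data are illustrative assumptions, not the paper's exact experimental setup; choosing $h(y) = \frac{1}{2}\|y\|^2$, so that $\nabla h$ is the identity, recovers plain gradient descent.

```python
import numpy as np

# Minimal sketch of dual space preconditioned gradient descent:
#     x_{k+1} = x_k - step * grad_h(grad_f(x_k)),
# where h is a designed dual preconditioner and step stands in for 1/L,
# L being the relative smoothness constant in the dual space.
def dual_preconditioned_gd(grad_f, grad_h, x0, step, n_iters):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - step * grad_h(grad_f(x))
    return x

# Toy p-norm regression: f(x) = (1/p) ||A x - b||_p^p with p > 2, whose
# gradient is not Lipschitz, so fixed-step gradient descent is fragile.
p = 4.0
q = p / (p - 1.0)  # Holder conjugate exponent of p
rng = np.random.default_rng(0)
A = 0.3 * rng.standard_normal((10, 3))  # modest scale keeps the toy demo stable
b = rng.standard_normal(10)

def f(x):
    return np.sum(np.abs(A @ x - b) ** p) / p

def grad_f(x):
    r = A @ x - b
    return A.T @ (np.abs(r) ** (p - 1) * np.sign(r))

# Illustrative power-type preconditioner h(y) = (1/q) * sum_i |y_i|^q:
# grad_h undoes the (p-1)-power growth of grad_f, so the composed map
# grad_h(grad_f(x)) grows roughly linearly in x even though grad_f does not.
def grad_h(y):
    return np.abs(y) ** (q - 1) * np.sign(y)

x0 = np.zeros(3)
x = dual_preconditioned_gd(grad_f, grad_h, x0, step=0.1, n_iters=300)
print(f(x0), "->", f(x))  # objective before vs. after the preconditioned run
```

In the scalar case $f(x) = |x|^p / p$ this choice gives $h = f^*$, the composed map $\nabla h \circ \nabla f$ reduces to the identity, and the iteration contracts at a rate independent of $p$; that perfect-preconditioning limit illustrates why moving the smoothness assumption to the dual space can tame objectives whose gradients defeat fixed-step gradient descent.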