Global convergence rate of incremental aggregated gradient methods for nonsmooth problems

We analyze the proximal incremental aggregated gradient (PIAG) method for minimizing the sum of a large number of smooth component functions, f(x) = ∑_{i=1}^m f_i(x), and a convex function r(x). Such composite optimization problems arise in a number of machine learning applications, including regularized regression problems and constrained distributed optimization problems over sensor networks. Our method computes an approximate gradient for the function f(x) by aggregating the component gradients evaluated at outdated iterates over a finite window K, and applies the proximal operator of the regularization function r(x) at the intermediate iterate obtained by moving along this approximate gradient. Under the assumptions that f(x) is strongly convex and each f_i(x) is smooth with Lipschitz gradients, we show the first linear convergence rate result for the PIAG method and provide explicit convergence rate estimates that highlight the dependence on the condition number of the problem and on the size of the window K over which outdated component gradients are evaluated.

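For orientation, the update described in the abstract can be sketched in a few lines of Python. This is a minimal illustration of the aggregated-gradient-plus-prox structure, not the authors' code: the cyclic component order, the names grad_fns, prox_r, and step_size, and the gradient-table bookkeeping are assumptions made here for concreteness, and a valid constant step size must be chosen small enough (the paper gives explicit bounds in terms of the condition number and the window K).

    import numpy as np

    def piag(grad_fns, prox_r, x0, step_size, n_iters):
        # Sketch of the proximal incremental aggregated gradient (PIAG) method.
        # grad_fns:  list of gradient oracles, one per smooth component f_i
        # prox_r:    prox of r, prox_r(v, eta) = argmin_x r(x) + ||x - v||^2 / (2*eta)
        m = len(grad_fns)
        x = x0.copy()
        # Table of the most recently computed gradient of each component.
        # Entries grow stale as x moves on, but the cyclic order below
        # refreshes each one at least every m steps, so the delay window
        # in this sketch is K = m - 1.
        table = [g(x) for g in grad_fns]
        agg = np.sum(table, axis=0)        # aggregate of stored (possibly outdated) gradients
        for k in range(n_iters):
            i = k % m                      # cyclic choice of the component to refresh
            g_new = grad_fns[i](x)         # fresh gradient of f_i at the current iterate
            agg += g_new - table[i]        # O(d) update of the aggregate
            table[i] = g_new
            # proximal step with respect to r at the intermediate point
            # reached by moving along the approximate gradient
            x = prox_r(x - step_size * agg, step_size)
        return x

For example, with r(x) = λ‖x‖₁ the prox is soft-thresholding, prox_r(v, η) = sign(v) · max(|v| − ηλ, 0), so the sketch above becomes an incremental aggregated-gradient analogue of ISTA for regularized least squares.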

Bibliographic Details
Main Authors: Vanli, Nuri Denizcan; Gurbuzbalaban, Mert; Koksal, Asuman E.
Other Authors: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE), 2017
Online Access: http://hdl.handle.net/1721.1/111781
ORCID: https://orcid.org/0000-0002-0575-2450; https://orcid.org/0000-0002-1827-1285
Item Type: Conference Paper
Date Issued: December 2016
Conference: 2016 IEEE 55th Conference on Decision and Control (CDC), December 12-14, 2016, Las Vegas, Nevada, USA
ISBN: 978-1-5090-1837-6
DOI: http://dx.doi.org/10.1109/CDC.2016.7798265
Citation: Vanli, N. Denizcan et al. "Global Convergence Rate of Incremental Aggregated Gradient Methods for Nonsmooth Problems." 2016 IEEE 55th Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers (IEEE), December 2016: 173-178. © 2016 Institute of Electrical and Electronics Engineers (IEEE)
License: Creative Commons Attribution-NonCommercial-ShareAlike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
Source: arXiv