When cyclic coordinate descent outperforms randomized coordinate descent

The coordinate descent (CD) method is a classical optimization algorithm that has seen a revival of interest because of its competitive performance in machine learning applications. A number of recent papers provided convergence rate estimates for their deterministic (cyclic) and randomized variants...

Full description

Bibliographic Details
Main Authors:	Gurbuzbalaban, Mert, Ozdaglar, Asuman E, Parrilo, Pablo A., Vanli, Nuri Denizcan
Other Authors:	Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Format:	Article
Language:	English
Published:	Neural Information Processing Systems Foundation, Inc. 2019
Online Access:	https://hdl.handle.net/1721.1/121536

_version_	1826191188849328128
author	Gurbuzbalaban, Mert Ozdaglar, Asuman E Parrilo, Pablo A. Vanli, Nuri Denizcan
author2	Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
author_facet	Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Gurbuzbalaban, Mert Ozdaglar, Asuman E Parrilo, Pablo A. Vanli, Nuri Denizcan
author_sort	Gurbuzbalaban, Mert
collection	MIT
description	The coordinate descent (CD) method is a classical optimization algorithm that has seen a revival of interest because of its competitive performance in machine learning applications. A number of recent papers provided convergence rate estimates for their deterministic (cyclic) and randomized variants that differ in the selection of update coordinates. These estimates suggest randomized coordinate descent (RCD) performs better than cyclic coordinate descent (CCD), although numerical experiments do not provide clear justification for this comparison. In this paper, we provide examples and more generally problem classes for which CCD (or CD with any deterministic order) is faster than RCD in terms of asymptotic worst-case convergence. Furthermore, we provide lower and upper bounds on the amount of improvement on the rate of CCD relative to RCD, which depends on the deterministic order used. We also provide a characterization of the best deterministic order (that leads to the maximum improvement in convergence rate) in terms of the combinatorial properties of the Hessian matrix of the objective function.
first_indexed	2024-09-23T08:51:51Z
format	Article
id	mit-1721.1/121536
institution	Massachusetts Institute of Technology
language	English
last_indexed	2024-09-23T08:51:51Z
publishDate	2019
publisher	Neural Information Processing Systems Foundation, Inc.
record_format	dspace
spelling	mit-1721.1/1215362022-09-30T11:48:29Z When cyclic coordinate descent outperforms randomized coordinate descent Gurbuzbalaban, Mert Ozdaglar, Asuman E Parrilo, Pablo A. Vanli, Nuri Denizcan Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science The coordinate descent (CD) method is a classical optimization algorithm that has seen a revival of interest because of its competitive performance in machine learning applications. A number of recent papers provided convergence rate estimates for their deterministic (cyclic) and randomized variants that differ in the selection of update coordinates. These estimates suggest randomized coordinate descent (RCD) performs better than cyclic coordinate descent (CCD), although numerical experiments do not provide clear justification for this comparison. In this paper, we provide examples and more generally problem classes for which CCD (or CD with any deterministic order) is faster than RCD in terms of asymptotic worst-case convergence. Furthermore, we provide lower and upper bounds on the amount of improvement on the rate of CCD relative to RCD, which depends on the deterministic order used. We also provide a characterization of the best deterministic order (that leads to the maximum improvement in convergence rate) in terms of the combinatorial properties of the Hessian matrix of the objective function. National Science Foundation (U.S.). Division of Materials Research (DMS-1723085) United States. Defense Advanced Research Projects Agency (Foundations of Scalable Statistical Learning grant) 2019-07-09T14:50:30Z 2019-07-09T14:50:30Z 2017-12 2019-06-28T16:12:25Z Article http://purl.org/eprint/type/ConferencePaper 1049-5258 https://hdl.handle.net/1721.1/121536 Gürbüzbalaban, Mert, Asuman Ozdaglar, Pablo A. Parrilo and N. Denizcan Vanli. "When cyclic coordinate descent outperforms randomized coordinate descent." Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, Dec. 4-9 2017. en https://papers.nips.cc/paper/7275-when-cyclic-coordinate-descent-outperforms-randomized-coordinate-descent Advances in Neural Information Processing Systems 30 (NIPS 2017) Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Neural Information Processing Systems Foundation, Inc. Neural Information Processing Systems (NIPS)
spellingShingle	Gurbuzbalaban, Mert Ozdaglar, Asuman E Parrilo, Pablo A. Vanli, Nuri Denizcan When cyclic coordinate descent outperforms randomized coordinate descent
title	When cyclic coordinate descent outperforms randomized coordinate descent
title_full	When cyclic coordinate descent outperforms randomized coordinate descent
title_fullStr	When cyclic coordinate descent outperforms randomized coordinate descent
title_full_unstemmed	When cyclic coordinate descent outperforms randomized coordinate descent
title_short	When cyclic coordinate descent outperforms randomized coordinate descent
title_sort	when cyclic coordinate descent outperforms randomized coordinate descent
url	https://hdl.handle.net/1721.1/121536
work_keys_str_mv	AT gurbuzbalabanmert whencycliccoordinatedescentoutperformsrandomizedcoordinatedescent AT ozdaglarasumane whencycliccoordinatedescentoutperformsrandomizedcoordinatedescent AT parrilopabloa whencycliccoordinatedescentoutperformsrandomizedcoordinatedescent AT vanlinuridenizcan whencycliccoordinatedescentoutperformsrandomizedcoordinatedescent

When cyclic coordinate descent outperforms randomized coordinate descent

Similar Items