Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models
We consider the graphical lasso, a popular optimization problem for learning the sparse representations of high-dimensional datasets, which is well known to be computationally expensive for large-scale problems. A recent line of results has shown, under mild assumptions, that the sparsity pattern of the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix. Based on this result, a closed-form solution has been obtained that is optimal when the thresholded sample covariance matrix has an acyclic structure. In this paper, we prove an extension of this result to the generalized graphical lasso (GGL), where additional sparsity constraints are imposed based on prior knowledge. Furthermore, we describe a recursive closed-form solution for the problem when the thresholded sample covariance matrix is chordal. Building upon this result, we describe a novel Newton-Conjugate Gradient algorithm that can efficiently solve the GGL with general structures. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve instances with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.
Main Authors: | Salar Fattahi, Richard Y. Zhang, Somayeh Sojoudi |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Optimization; graphical models; numerical algorithms |
Online Access: | https://ieeexplore.ieee.org/document/8598839/ |
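As a rough illustration of the soft-thresholding step described in the abstract, the sketch below forms the sample covariance matrix, soft-thresholds its off-diagonal entries, and reads off the estimated sparsity pattern. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation (the paper reports a MATLAB solver); the function name, the threshold value, and the synthetic data are hypothetical.

```python
import numpy as np

def soft_thresholded_pattern(X, lam):
    """Soft-threshold the sample covariance of data X (rows = samples,
    columns = variables) and return the implied sparsity pattern.
    Illustrative sketch only; lam plays the role of the graphical lasso
    regularization parameter."""
    S = np.cov(X, rowvar=False)                      # sample covariance matrix
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0)  # soft-threshold entries
    np.fill_diagonal(T, np.diag(S))                  # leave the diagonal unthresholded
    pattern = T != 0                                 # nonzeros = estimated graph edges
    return T, pattern

# Example usage on synthetic data (values chosen arbitrarily):
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))   # 500 samples, 50 variables
T, pattern = soft_thresholded_pattern(X, lam=0.1)
print(pattern.sum())                 # size of the estimated edge pattern
```

Under the assumptions referenced in the abstract, this pattern coincides with the sparsity pattern of the graphical lasso estimator, which is what lets the paper restrict attention to a known (e.g., acyclic or chordal) structure.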
author | Salar Fattahi; Richard Y. Zhang; Somayeh Sojoudi
collection | DOAJ |
description | We consider the graphical lasso, a popular optimization problem for learning the sparse representations of high-dimensional datasets, which is well known to be computationally expensive for large-scale problems. A recent line of results has shown, under mild assumptions, that the sparsity pattern of the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix. Based on this result, a closed-form solution has been obtained that is optimal when the thresholded sample covariance matrix has an acyclic structure. In this paper, we prove an extension of this result to the generalized graphical lasso (GGL), where additional sparsity constraints are imposed based on prior knowledge. Furthermore, we describe a recursive closed-form solution for the problem when the thresholded sample covariance matrix is chordal. Building upon this result, we describe a novel Newton-Conjugate Gradient algorithm that can efficiently solve the GGL with general structures. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve instances with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.
format | Article |
id | doaj.art-561aed1d7f6f47d593641a4b506fcdfc |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | Salar Fattahi, Richard Y. Zhang, and Somayeh Sojoudi, "Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models," IEEE Access, vol. 7, pp. 12658-12672, 2019, doi: 10.1109/ACCESS.2018.2890583, IEEE document 8598839. Affiliations: Salar Fattahi and Richard Y. Zhang, Department of Industrial Engineering and Operations Research, University of California at Berkeley, Berkeley, CA, USA; Somayeh Sojoudi (ORCID: https://orcid.org/0000-0001-7177-7712), Departments of Electrical Engineering and Computer Sciences and Mechanical Engineering, University of California at Berkeley, Berkeley, CA, USA.
title | Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models |
topic | Optimization; graphical models; numerical algorithms
url | https://ieeexplore.ieee.org/document/8598839/ |