Latent Variable Graphical Model Selection Via Convex Optimization

Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistic...

Bibliographic Details
Main Authors:	Chandrasekaran, Venkat; Parrilo, Pablo A.; Willsky, Alan S.
Other Authors:	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format:	Article
Language:	en_US
Published:	Institute of Electrical and Electronics Engineers (IEEE) 2012
Online Access:	http://hdl.handle.net/1721.1/72612 https://orcid.org/0000-0003-1132-8477 https://orcid.org/0000-0003-0149-5888

_version_	1811084372556644352
author	Chandrasekaran, Venkat Parrilo, Pablo A. Willsky, Alan S.
author2	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Chandrasekaran, Venkat Parrilo, Pablo A. Willsky, Alan S.
author_sort	Chandrasekaran, Venkat
collection	MIT
description	Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latent-variable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out” over most of the observed variables. Next we propose a tractable convex program based on regularized maximum-likelihood for model selection in this latent-variable setting; the regularizer uses both the ℓ₁ norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables. These results are applicable in the high-dimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of low-rank matrices play an important role in our analysis.
first_indexed	2024-09-23T12:49:41Z
format	Article
id	mit-1721.1/72612
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T12:49:41Z
publishDate	2012
publisher	Institute of Electrical and Electronics Engineers (IEEE)
record_format	dspace
spelling	mit-1721.1/72612 2022-09-28T10:15:37Z Latent Variable Graphical Model Selection Via Convex Optimization Chandrasekaran, Venkat Parrilo, Pablo A. Willsky, Alan S. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Laboratory for Information and Decision Systems 2012-09-11T15:10:32Z 2012-09-11T15:10:32Z 2011-02 2010-09 Article http://purl.org/eprint/type/ConferencePaper 978-1-4244-8215-3 http://hdl.handle.net/1721.1/72612 Chandrasekaran, Venkat, Pablo A. Parrilo, and Alan S. Willsky. “Latent Variable Graphical Model Selection via Convex Optimization.” 48th Annual Allerton Conference on Communication, Control, and Computing 2010 (Allerton). 1610–1613. © Copyright 2010 IEEE https://orcid.org/0000-0003-1132-8477 https://orcid.org/0000-0003-0149-5888 en_US http://dx.doi.org/10.1109/ALLERTON.2010.5707106 48th Annual Allerton Conference on Communication, Control, and Computing 2010 (Allerton) Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Institute of Electrical and Electronics Engineers (IEEE) IEEE
spellingShingle	Chandrasekaran, Venkat Parrilo, Pablo A. Willsky, Alan S. Latent Variable Graphical Model Selection Via Convex Optimization
title	Latent Variable Graphical Model Selection Via Convex Optimization
title_full	Latent Variable Graphical Model Selection Via Convex Optimization
title_fullStr	Latent Variable Graphical Model Selection Via Convex Optimization
title_full_unstemmed	Latent Variable Graphical Model Selection Via Convex Optimization
title_short	Latent Variable Graphical Model Selection Via Convex Optimization
title_sort	latent variable graphical model selection via convex optimization
url	http://hdl.handle.net/1721.1/72612 https://orcid.org/0000-0003-1132-8477 https://orcid.org/0000-0003-0149-5888
work_keys_str_mv	AT chandrasekaranvenkat latentvariablegraphicalmodelselectionviaconvexoptimization AT parrilopabloa latentvariablegraphicalmodelselectionviaconvexoptimization AT willskyalans latentvariablegraphicalmodelselectionviaconvexoptimization
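
The sparse-plus-low-rank structure that the description refers to can be illustrated with a small sketch. It assumes three observed variables and a single latent variable; the matrices, the helper names (schur_low_rank, l1_norm, trace, penalty) and the weights lam and gamma are hypothetical and chosen only for illustration. For jointly Gaussian variables, the marginal precision matrix of the observed variables is the Schur complement S - L, where S is the conditional precision among the observed variables and L is the contribution of the latent variables. The penalty shown is one plausible way to combine the ℓ₁ norm and the nuclear norm; this record does not state the exact objective.

```python
# Illustrative sketch only: all names and numbers are assumptions,
# not values or code from the catalogued article.

def schur_low_rank(k_oh, k_hh):
    """Latent contribution L = k_oh * k_oh^T / k_hh for ONE latent variable.

    k_oh: couplings between each observed variable and the latent variable.
    k_hh: (scalar) precision of the latent variable.
    The result has rank one by construction.
    """
    return [[a * b / k_hh for b in k_oh] for a in k_oh]


def l1_norm(m):
    """Entrywise l1 norm: sum of absolute values of all entries."""
    return sum(abs(x) for row in m for x in row)


def trace(m):
    """Trace; for a positive semidefinite matrix this equals the nuclear norm."""
    return sum(m[i][i] for i in range(len(m)))


def penalty(s, l, lam, gamma):
    """Assumed regularizer: lam * (gamma * ||S||_1 + tr(L))."""
    return lam * (gamma * l1_norm(s) + trace(l))


# Sparse conditional precision among observed variables (chain 0 - 1 - 2):
# variables 0 and 2 are conditionally independent given 1 and the latent.
S = [[2.0, -0.5, 0.0],
     [-0.5, 2.0, -0.5],
     [0.0, -0.5, 2.0]]

# The latent variable touches every observed variable ("spread out").
k_oh = [0.5, 0.5, 0.5]
k_hh = 1.0

L = schur_low_rank(k_oh, k_hh)

# Marginal precision of the observed variables after integrating out the latent.
marginal_precision = [[S[i][j] - L[i][j] for j in range(3)] for i in range(3)]

nonzeros_S = sum(1 for row in S for x in row if x != 0.0)
nonzeros_marginal = sum(1 for row in marginal_precision for x in row if x != 0.0)

print("nonzeros in S:", nonzeros_S)
print("nonzeros in S - L:", nonzeros_marginal)
print("penalty:", penalty(S, L, lam=0.1, gamma=0.5))
```

In this toy example S has zero entries while S - L is fully dense, which is why recovering the sparse conditional structure requires separating the two terms rather than sparsifying the marginal precision matrix directly.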

Similar Items