Scaling laws for learning high-dimensional Markov forest distributions

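The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.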
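
The approach summarized in the description (build the Chow-Liu tree, then prune weak edges with a threshold that adapts to the sample size) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the plug-in mutual information estimator, and the threshold form n^(-beta) with beta = 0.5 are assumptions made for illustration only; the record does not state the exact thresholding rule.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return max(mi, 0.0)  # guard against tiny negative rounding errors


def chow_liu_tree(columns):
    """Maximum-weight spanning tree on empirical mutual information (Kruskal)."""
    d = len(columns)
    weights = {
        (i, j): empirical_mutual_information(columns[i], columns[j])
        for i, j in combinations(range(d), 2)
    }
    parent = list(range(d))

    def find(u):
        # Union-find root lookup with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so keep it
            parent[ri] = rj
            tree.append(((i, j), w))
    return tree


def learn_forest(samples, beta=0.5):
    """Prune Chow-Liu tree edges whose weight falls below n ** (-beta).

    `samples` is a list of equal-length tuples of discrete values.
    The threshold form is an assumption, not taken from the record.
    """
    n = len(samples)
    columns = list(zip(*samples))
    threshold = n ** (-beta)  # shrinks as the number of samples grows
    return sorted(edge for edge, w in chow_liu_tree(columns) if w >= threshold)
```

For example, calling `learn_forest` on samples where the first two variables are copies of each other and the third is independent of both keeps only the edge between the first two variables, because the remaining tree edge carries too little empirical mutual information to survive pruning.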

Bibliographic Details
Main Authors:	Willsky, Alan S., Tan, Vincent Yan Fu, Anandkumar, Animashree
Other Authors:	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format:	Article
Language:	en_US
Published:	Institute of Electrical and Electronics Engineers (IEEE) 2012
Online Access:	http://hdl.handle.net/1721.1/73590 https://orcid.org/0000-0003-0149-5888

_version_	1811088746859200512
author	Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree
author2	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree
author_sort	Willsky, Alan S.
collection	MIT
description	The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
first_indexed	2024-09-23T14:07:05Z
format	Article
id	mit-1721.1/73590
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T14:07:05Z
publishDate	2012
publisher	Institute of Electrical and Electronics Engineers (IEEE)
record_format	dspace
spelling	mit-1721.1/735902022-10-01T19:17:12Z Scaling laws for learning high-dimensional Markov forest distributions Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning. United States. Air Force Office of Scientific Research (Grant FA9559-08-1- 1080) United States. Army Research Office. Multidisciplinary University Research Initiative (Grant W911NF-06-1-0076) United States. Army Research Office. Multidisciplinary University Research Initiative (Grant FA9550-06-1-0324) Singapore. Agency for Science, Technology and Research 2012-10-04T13:43:41Z 2012-10-04T13:43:41Z 2010-09 2010-09 Article http://purl.org/eprint/type/ConferencePaper 978-1-4244-8215-3 http://hdl.handle.net/1721.1/73590 Tan, Vincent Y. F., Animashree Anandkumar, and Alan S. Wi. 
"Scaling laws for learning high-dimensional Markov forest distributions" 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010. 712–718. https://orcid.org/0000-0003-0149-5888 en_US http://dx.doi.org/10.1109/ALLERTON.2010.5706977 Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010 Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) Other University Web Domain
spellingShingle	Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree Scaling laws for learning high-dimensional Markov forest distributions
title	Scaling laws for learning high-dimensional Markov forest distributions
title_sort	scaling laws for learning high dimensional markov forest distributions
url	http://hdl.handle.net/1721.1/73590 https://orcid.org/0000-0003-0149-5888
