Scaling laws for learning high-dimensional Markov forest distributions

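The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.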
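
The pruning idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`empirical_mi`, `chow_liu_tree`, `prune_to_forest`), the threshold form `n ** (-beta)`, and the default `beta=0.5` are assumptions chosen for illustration, since this record does not state the exact threshold sequence. The sketch builds a Chow-Liu tree as a maximum-weight spanning tree over plug-in mutual information estimates, then drops edges whose estimate falls below a threshold that shrinks with the sample size n.

```python
import itertools
import numpy as np


def empirical_mi(x, y):
    """Plug-in mutual information (in nats) between two discrete sample vectors."""
    n = len(x)
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p_ab = np.sum((x == a) & (y == b)) / n
            if p_ab > 0:  # empty cells contribute nothing
                p_a = np.sum(x == a) / n
                p_b = np.sum(y == b) / n
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_tree(data):
    """Maximum-weight spanning tree over pairwise empirical MI (Kruskal)."""
    _, d = data.shape
    weights = {(i, j): empirical_mi(data[:, i], data[:, j])
               for i, j in itertools.combinations(range(d), 2)}
    parent = list(range(d))  # union-find forest for cycle detection

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, w))
    return tree


def prune_to_forest(tree, n, beta=0.5):
    """Keep tree edges whose MI is at least eps_n = n**(-beta).

    The threshold form and beta are illustrative assumptions.
    """
    eps_n = n ** (-beta)
    return [(i, j) for i, j, w in tree if w >= eps_n]


# Toy data: variable 1 copies variable 0; variable 2 is independent of both.
idx = np.arange(400)
x0 = idx % 2
x1 = x0.copy()
x2 = (idx // 2) % 2
data = np.column_stack([x0, x1, x2])

tree = chow_liu_tree(data)
forest = prune_to_forest(tree, n=data.shape[0])
print(forest)
```

On this toy data the spanning tree necessarily contains one zero-information edge touching variable 2, and the threshold removes it, leaving only the edge between variables 0 and 1. A fixed threshold would not adapt to n; letting it shrink with the sample size is what the sketch means by "adaptive".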

Bibliographic Details
Main Authors: Willsky, Alan S., Tan, Vincent Yan Fu, Anandkumar, Animashree
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers (IEEE) 2012
Online Access: http://hdl.handle.net/1721.1/73590
https://orcid.org/0000-0003-0149-5888
collection MIT
description The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
first_indexed 2024-09-23T14:07:05Z
format Article
id mit-1721.1/73590
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T14:07:05Z
publishDate 2012
publisher Institute of Electrical and Electronics Engineers (IEEE)
record_format dspace
spelling mit-1721.1/73590 2022-10-01T19:17:12Z
Affiliations: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Sponsors: United States. Air Force Office of Scientific Research (Grant FA9559-08-1-1080); United States. Army Research Office. Multidisciplinary University Research Initiative (Grant W911NF-06-1-0076); United States. Army Research Office. Multidisciplinary University Research Initiative (Grant FA9550-06-1-0324); Singapore. Agency for Science, Technology and Research
Dates: 2012-10-04T13:43:41Z; 2010-09
Type: Article; http://purl.org/eprint/type/ConferencePaper
ISBN: 978-1-4244-8215-3
Handle: http://hdl.handle.net/1721.1/73590
Citation: Tan, Vincent Y. F., Animashree Anandkumar, and Alan S. Wi. "Scaling laws for learning high-dimensional Markov forest distributions" 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010. 712–718.
ORCID: https://orcid.org/0000-0003-0149-5888
Language: en_US
DOI: http://dx.doi.org/10.1109/ALLERTON.2010.5706977
Published in: Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010
License: Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/
File format: application/pdf
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Source: Other University Web Domain