Scaling laws for learning high-dimensional Markov forest distributions
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learn...
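The pruning idea in the abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes a plug-in mutual information estimate, Kruskal's algorithm for the Chow-Liu (maximum-weight spanning) tree, and a threshold of the form `n ** -beta`, which is an illustrative choice that this record does not specify. All function names and the toy data generator are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def empirical_mi(samples, i, j):
    """Plug-in mutual information estimate (in nats) between columns i and j."""
    n = len(samples)
    joint = Counter((row[i], row[j]) for row in samples)
    marg_i = Counter(row[i] for row in samples)
    marg_j = Counter(row[j] for row in samples)
    return sum(
        (c / n) * math.log(c * n / (marg_i[a] * marg_j[b]))
        for (a, b), c in joint.items()
    )


def chow_liu_tree(samples):
    """Maximum-weight spanning tree over pairwise empirical MI (Kruskal)."""
    d = len(samples[0])
    weights = {(i, j): empirical_mi(samples, i, j)
               for i, j in combinations(range(d), 2)}
    parent = list(range(d))  # union-find forest used for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            tree.append(((i, j), w))
    return tree


def prune_to_forest(tree, n, beta=0.5):
    """Keep only tree edges whose empirical MI reaches the threshold.

    Assumption: the threshold n ** -beta shrinks with the sample size n;
    this schedule is illustrative and not taken from the record.
    """
    threshold = n ** (-beta)
    return [edge for edge, w in tree if w >= threshold]


def sample_toy_forest(n, seed=0):
    """Hypothetical binary data with true forest edges (0, 1) and (2, 3);
    variable 4 is independent of the rest."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = rng.randint(0, 1)
        x1 = x0 ^ int(rng.random() < 0.1)  # noisy copy of x0
        x2 = rng.randint(0, 1)
        x3 = x2 ^ int(rng.random() < 0.1)  # noisy copy of x2
        x4 = rng.randint(0, 1)
        rows.append((x0, x1, x2, x3, x4))
    return rows


if __name__ == "__main__":
    data = sample_toy_forest(2000)
    print(prune_to_forest(chow_liu_tree(data), len(data)))
```

The spanning tree always has d - 1 edges, so on data from a forest it must contain spurious edges between independent components. The thresholding step is what removes them.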
Main Authors: | Willsky, Alan S.; Tan, Vincent Yan Fu; Anandkumar, Animashree |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers (IEEE) 2012 |
Online Access: | http://hdl.handle.net/1721.1/73590 https://orcid.org/0000-0003-0149-5888 |
_version_ | 1811088746859200512 |
---|---|
author | Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_sort | Willsky, Alan S. |
collection | MIT |
description | The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning. |
first_indexed | 2024-09-23T14:07:05Z |
format | Article |
id | mit-1721.1/73590 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T14:07:05Z |
publishDate | 2012 |
publisher | Institute of Electrical and Electronics Engineers (IEEE) |
record_format | dspace |
spelling | mit-1721.1/73590 2022-10-01T19:17:12Z Scaling laws for learning high-dimensional Markov forest distributions Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Willsky, Alan S. Tan, Vincent Yan Fu Anandkumar, Animashree The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is structurally consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to be structurally consistent. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning. United States. Air Force Office of Scientific Research (Grant FA9559-08-1-1080) United States. Army Research Office. Multidisciplinary University Research Initiative (Grant W911NF-06-1-0076) United States. Army Research Office. Multidisciplinary University Research Initiative (Grant FA9550-06-1-0324) Singapore. Agency for Science, Technology and Research 2012-10-04T13:43:41Z 2012-10-04T13:43:41Z 2010-09 2010-09 Article http://purl.org/eprint/type/ConferencePaper 978-1-4244-8215-3 http://hdl.handle.net/1721.1/73590 Tan, Vincent Y. F., Animashree Anandkumar, and Alan S. Wi. "Scaling laws for learning high-dimensional Markov forest distributions" 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010. 712–718. https://orcid.org/0000-0003-0149-5888 en_US http://dx.doi.org/10.1109/ALLERTON.2010.5706977 Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010 Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) Other University Web Domain |
title | Scaling laws for learning high-dimensional Markov forest distributions |
url | http://hdl.handle.net/1721.1/73590 https://orcid.org/0000-0003-0149-5888 |