Learning graphical models for hypothesis testing and classification
Sparse graphical models have proven to be a flexible class of multivariate probability models for approximating high-dimensional distributions. In this paper, we propose techniques to exploit this modeling ability for binary classification by discriminatively learning such models from labeled training data, i.e., using both positive and negative samples to optimize for the structures of the two models. We motivate why it is difficult to adapt existing generative methods, and propose an alternative method consisting of two parts. First, we develop a novel method to learn tree-structured graphical models which optimizes an approximation of the log-likelihood ratio. We also formulate a joint objective to learn a nested sequence of optimal forest-structured models. Second, we construct a classifier by using ideas from boosting to learn a set of discriminative trees. The final classifier can be interpreted as a likelihood ratio test between two models with a larger set of pairwise features. We use cross-validation to determine the optimal number of edges in the final model. The algorithm presented in this paper also provides a method to identify a subset of the edges that are most salient for discrimination. Experiments show that the proposed procedure outperforms generative methods such as Tree Augmented Naïve Bayes and Chow-Liu as well as their boosted counterparts.
Main Authors: | Tan, Vincent Yan Fu; Sanghavi, Sujay; Fisher, John W., III; Willsky, Alan S. |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers (IEEE), 2012 |
Online Access: | http://hdl.handle.net/1721.1/73608 https://orcid.org/0000-0003-4844-3495 https://orcid.org/0000-0003-0149-5888 |
author | Tan, Vincent Yan Fu Sanghavi, Sujay Fisher, John W., III Willsky, Alan S. |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
collection | MIT |
description | Sparse graphical models have proven to be a flexible class of multivariate probability models for approximating high-dimensional distributions. In this paper, we propose techniques to exploit this modeling ability for binary classification by discriminatively learning such models from labeled training data, i.e., using both positive and negative samples to optimize for the structures of the two models. We motivate why it is difficult to adapt existing generative methods, and propose an alternative method consisting of two parts. First, we develop a novel method to learn tree-structured graphical models which optimizes an approximation of the log-likelihood ratio. We also formulate a joint objective to learn a nested sequence of optimal forest-structured models. Second, we construct a classifier by using ideas from boosting to learn a set of discriminative trees. The final classifier can be interpreted as a likelihood ratio test between two models with a larger set of pairwise features. We use cross-validation to determine the optimal number of edges in the final model. The algorithm presented in this paper also provides a method to identify a subset of the edges that are most salient for discrimination. Experiments show that the proposed procedure outperforms generative methods such as Tree Augmented Naïve Bayes and Chow-Liu as well as their boosted counterparts. |
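The abstract contrasts the proposed discriminative procedure with generative baselines such as Chow-Liu, which fits each class a maximum-weight spanning tree over pairwise mutual information and then classifies by a likelihood ratio test between the two tree models. A minimal sketch of that generative Chow-Liu baseline for binary data (not the paper's discriminative objective; all function names and the Laplace-smoothing choice are illustrative):

```python
import numpy as np
from itertools import combinations

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two binary columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pxy = np.mean((x == a) & (y == b))
            px, py = np.mean(x == a), np.mean(y == b)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def chow_liu_tree(data):
    """Chow-Liu structure learning: maximum-weight spanning tree where the
    weight of edge (i, j) is the empirical mutual information I(X_i; X_j).

    data: (n, d) array of 0/1 samples. Returns a list of tree edges (i, j).
    """
    d = data.shape[1]
    weighted = sorted(
        ((mutual_information(data[:, i], data[:, j]), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True)
    parent = list(range(d))            # union-find for Kruskal's algorithm
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for _, i, j in weighted:           # greedily add heaviest non-cycle edges
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

def tree_log_likelihood(x, data, tree, alpha=1.0):
    """Log-likelihood of sample x under the tree factorization
    p(x) = prod_i p(x_i) * prod_{(i,j) in tree} p(x_i, x_j) / (p(x_i) p(x_j)),
    with Laplace-smoothed marginals estimated from training data."""
    n, d = data.shape
    def p1(i, a):
        return (np.sum(data[:, i] == a) + alpha) / (n + 2 * alpha)
    def p2(i, j, a, b):
        return (np.sum((data[:, i] == a) & (data[:, j] == b)) + alpha) / (n + 4 * alpha)
    ll = sum(np.log(p1(i, x[i])) for i in range(d))
    for i, j in tree:
        ll += np.log(p2(i, j, x[i], x[j]) / (p1(i, x[i]) * p1(j, x[j])))
    return ll
```

A two-class classifier in this generative setup learns one tree per class and declares the positive class when `tree_log_likelihood(x, pos_data, pos_tree) - tree_log_likelihood(x, neg_data, neg_tree) > 0`; the paper's contribution is to choose the tree structures to optimize (an approximation of) this log-likelihood ratio directly rather than each class's own likelihood.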
format | Article |
id | mit-1721.1/73608 |
institution | Massachusetts Institute of Technology |
language | en_US |
publishDate | 2012 |
publisher | Institute of Electrical and Electronics Engineers (IEEE) |
record_format | dspace |
departments | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems; Massachusetts Institute of Technology. Stochastic Systems Group |
funding | United States. Air Force Office of Scientific Research (Grant FA9550-08-1-1080); United States. Army Research Office. Multidisciplinary University Research Initiative (Grant W911NF-06-1-0076); United States. Air Force Office of Scientific Research. Multidisciplinary University Research Initiative (Grant FA9550-06-1-0324); Singapore. Agency for Science, Technology and Research; United States. Air Force Research Laboratory (Award No. FA8650-07-D-1220) |
type | Article (http://purl.org/eprint/type/JournalArticle) |
issn | 1053-587X; 1941-0476 |
citation | Tan, Vincent Y. F. et al. “Learning Graphical Models for Hypothesis Testing and Classification.” IEEE Transactions on Signal Processing 58.11 (2010): 5481–5495. © Copyright 2010 IEEE |
doi | http://dx.doi.org/10.1109/TSP.2010.2059019 |
journal | IEEE Transactions on Signal Processing |
rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. |
title | Learning graphical models for hypothesis testing and classification |