The information regularization framework for semi-supervised learning

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.

Bibliographic Details
Main Author: Corduneanu, Adrian (Adrian Dumitru), 1977-
Other Authors: Tommi Jaakkola.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2007
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/37917
_version_ 1826211022182023168
author Corduneanu, Adrian (Adrian Dumitru), 1977-
author2 Tommi Jaakkola.
author_facet Tommi Jaakkola.
Corduneanu, Adrian (Adrian Dumitru), 1977-
author_sort Corduneanu, Adrian (Adrian Dumitru), 1977-
collection MIT
description Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.
first_indexed 2024-09-23T14:59:39Z
format Thesis
id mit-1721.1/37917
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T14:59:39Z
publishDate 2007
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/37917 2019-04-10T16:41:15Z The information regularization framework for semi-supervised learning Corduneanu, Adrian (Adrian Dumitru), 1977- Tommi Jaakkola. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 147-154).

In recent years, the study of classification has shifted to algorithms for training the classifier from data that may be missing the class label. While traditional supervised classifiers already have the ability to cope with some incomplete data, the new type of classifier does not view unlabeled data as an anomaly, and can learn from data sets in which the large majority of training points are unlabeled. Classification with labeled and unlabeled data, or semi-supervised classification, has important practical significance, as training sets with a mix of labeled and unlabeled data are commonplace. In many domains, such as categorization of web pages, it is easier to collect unlabeled data than to annotate the training points with labels.

This thesis is a study of the information regularization method for semi-supervised classification, a unified framework that encompasses many of the common approaches to semi-supervised learning, including parametric models of incomplete data, harmonic graph regularization, redundancy of sufficient features (co-training), and combinations of these principles in a single algorithm. We discuss the framework in both parametric and non-parametric settings, as a transductive or inductive classifier, considered as a stand-alone classifier, or applied as post-processing to standard supervised classifiers. We study theoretical properties of the framework, and illustrate it on categorization of web pages and named-entity recognition.

by Adrian Corduneanu. Ph.D. 2007-07-18T13:10:42Z 2006 Thesis http://hdl.handle.net/1721.1/37917 135235565 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 154 p. application/pdf Massachusetts Institute of Technology
title The information regularization framework for semi-supervised learning
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/37917