Subspace and graph methods to leverage auxiliary data for limited target data multi-class classification, applied to speaker verification

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.

Bibliographic details
Main author: Karam, Zahi Nadim
Other authors: William M. Campbell and Alan V. Oppenheim.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2011
Subjects: Electrical Engineering and Computer Science.
Online access: http://hdl.handle.net/1721.1/66009
author Karam, Zahi Nadim
author2 William M. Campbell and Alan V. Oppenheim.
collection MIT
description Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
first_indexed 2024-09-23T08:11:47Z
format Thesis
id mit-1721.1/66009
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T08:11:47Z
publishDate 2011
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/66009 2019-04-09T17:16:00Z Subspace and graph methods to leverage auxiliary data for limited target data multi-class classification, applied to speaker verification Karam, Zahi Nadim William M. Campbell and Alan V. Oppenheim. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 127-130).
Multi-class classification can be adversely affected by the absence of sufficient target (in-class) instances for training. Such cases arise in face recognition, speaker verification, and document classification, among others. Auxiliary data-sets, which contain a diverse sampling of non-target instances, are leveraged in this thesis using subspace and graph methods to improve classification where target data is limited. The auxiliary data is used to define a compact representation that maps instances into a vector space where inner products quantify class similarity. Within this space, an estimate of the subspace that constitutes within-class variability (e.g., the recording channel in speaker verification or the illumination conditions in face recognition) can be obtained using class-labeled auxiliary data. This thesis proposes a way to incorporate this estimate into the SVM framework to perform nuisance compensation, thus improving classification performance. Another contribution is a framework that combines mapping and compensation into a single linear comparison, which motivates computationally inexpensive and accurate comparison functions.
A key aspect of the work takes advantage of efficient pairwise comparisons between the training, test, and auxiliary instances to characterize their interaction within the vector space, and exploits it for improved classification in three ways. The first uses the local variability around the train and test instances to reduce false alarms. The second assumes the instances lie on a low-dimensional manifold and uses the distances along the manifold. The third extracts relational features from a similarity graph where nodes correspond to the training, test, and auxiliary instances. To quantify the merit of the proposed techniques, results of experiments in speaker verification are presented where only a single target recording is provided to train the classifier. Experiments are performed on standard NIST corpora and methods are compared using standard evaluation metrics: detection error trade-off curves, minimum decision costs, and equal error rates.
by Zahi Nadim Karam. Ph.D. 2011-09-27T18:31:56Z 2011-09-27T18:31:56Z 2011 2011 Thesis http://hdl.handle.net/1721.1/66009 751924501 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 130 p. application/pdf Massachusetts Institute of Technology
spellingShingle Electrical Engineering and Computer Science.
Karam, Zahi Nadim
Subspace and graph methods to leverage auxiliary data for limited target data multi-class classification, applied to speaker verification
title Subspace and graph methods to leverage auxiliary data for limited target data multi-class classification, applied to speaker verification
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/66009