Sparse distance metric learning

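A good distance metric can improve the accuracy of a nearest neighbour classifier. Xing et al. (2002) proposed distance metric learning to find a linear transformation of the data so that observations of different classes are better separated. For high-dimensional problems where many uninformative variables are present, it is attractive to select a sparse distance metric, both to increase predictive accuracy and to aid interpretation of the result. In this thesis, we investigate three types of sparsity assumption for distance metric learning and show that sparse recovery is possible under each with an appropriate choice of L1-type penalty. We show that a lasso penalty promotes a transformation matrix with many zero entries, a group lasso penalty recovers a transformation matrix with zero rows or columns, and a trace norm penalty allows us to learn a low-rank transformation matrix. The regularization allows us to consider a large number of covariates, and we apply the technique to an expanded set of basis functions, called a rule ensemble, to allow for a more flexible fit. Finally, we illustrate an application of the metric learning problem via a document retrieval example and discuss how similarity-based information can be applied to learn a classifier.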

Bibliographic Details
Main Author:	Choy, T
Other Authors:	Meinshausen, N
Format:	Thesis
Language:	English
Published:	2014
Subjects:	Statistics (see also social sciences); Pattern recognition (statistics)

_version_	1826290046999724032
author	Choy, T
author2	Meinshausen, N
author_facet	Meinshausen, N; Choy, T
author_sort	Choy, T
collection	OXFORD
description	A good distance metric can improve the accuracy of a nearest neighbour classifier. Xing et al. (2002) proposed distance metric learning to find a linear transformation of the data so that observations of different classes are better separated. For high-dimensional problems where many uninformative variables are present, it is attractive to select a sparse distance metric, both to increase predictive accuracy and to aid interpretation of the result. In this thesis, we investigate three types of sparsity assumption for distance metric learning and show that sparse recovery is possible under each with an appropriate choice of L1-type penalty. We show that a lasso penalty promotes a transformation matrix with many zero entries, a group lasso penalty recovers a transformation matrix with zero rows or columns, and a trace norm penalty allows us to learn a low-rank transformation matrix. The regularization allows us to consider a large number of covariates, and we apply the technique to an expanded set of basis functions, called a rule ensemble, to allow for a more flexible fit. Finally, we illustrate an application of the metric learning problem via a document retrieval example and discuss how similarity-based information can be applied to learn a classifier.
first_indexed	2024-03-07T02:38:12Z
format	Thesis
id	oxford-uuid:a98695a3-0a60-448f-9ec0-63da3c37f7fa
institution	University of Oxford
language	English
last_indexed	2024-03-07T02:38:12Z
publishDate	2014
record_format	dspace
spelling	oxford-uuid:a98695a3-0a60-448f-9ec0-63da3c37f7fa; 2022-03-27T03:09:02Z; Sparse distance metric learning; Thesis; http://purl.org/coar/resource_type/c_db06; uuid:a98695a3-0a60-448f-9ec0-63da3c37f7fa; Statistics (see also social sciences); Pattern recognition (statistics); English; Oxford University Research Archive - Valet; 2014; Choy, T; Meinshausen, N
spellingShingle	Statistics (see also social sciences); Pattern recognition (statistics); Choy, T; Sparse distance metric learning
title	Sparse distance metric learning
title_full	Sparse distance metric learning
title_fullStr	Sparse distance metric learning
title_full_unstemmed	Sparse distance metric learning
title_short	Sparse distance metric learning
title_sort	sparse distance metric learning
topic	Statistics (see also social sciences); Pattern recognition (statistics)
work_keys_str_mv	AT choyt sparsedistancemetriclearning

Similar Items