Tangent space and dimension estimation with the Wasserstein distance

Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this e...
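
The description names Local PCA as the estimation algorithm. The following is a minimal sketch of that general idea, not the authors' procedure: the function name `local_pca`, the neighbourhood `radius`, and the eigenvalue `threshold` rule are illustrative assumptions, and the explicit sample-size bounds from the work are not reproduced here.

```python
import numpy as np


def local_pca(points, center, radius, threshold=0.1):
    """Estimate local dimension and a tangent basis at `center` (illustrative only)."""
    # Keep only the sample points within `radius` of the chosen center.
    dists = np.linalg.norm(points - center, axis=1)
    nbrs = points[dists <= radius]
    if len(nbrs) < 2:
        raise ValueError("too few points in the neighbourhood")
    # Empirical covariance matrix of the neighbourhood.
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / len(nbrs)
    # eigh returns eigenvalues in ascending order; reverse to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    # Assumed rule: count eigenvalues above a fixed fraction of the largest one.
    dim = int(np.sum(eigvals > threshold * eigvals[0]))
    # The leading `dim` eigenvectors span the estimated tangent space.
    return dim, eigvecs[:, :dim]


# Usage: noisy samples near the unit circle embedded in R^3.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
circle = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])
samples = circle + rng.normal(scale=0.01, size=circle.shape)

dim, basis = local_pca(samples, center=np.array([1.0, 0.0, 0.0]), radius=0.3)
print("estimated dimension:", dim)
print("estimated tangent direction:", basis[:, 0])
```

At the point (1, 0, 0) the true tangent line of the circle is the y-axis, so the estimated direction should be close to (0, ±1, 0). In this sketch the radius and threshold are chosen by hand.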

Bibliographic Details
Main Authors: Lim, U; Oberhauser, H; Nanda, V
Format: Internet publication
Language: English
Published: 2021
author Lim, U
Oberhauser, H
Nanda, V
collection OXFORD
description Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this estimation is Local PCA, a local version of principal component analysis. Our results accommodate noisy, non-uniform data distributions in which the noise may vary across the manifold, and allow simultaneous estimation at multiple points. Crucially, all of the constants appearing in our bound are explicitly described. The proof uses a matrix concentration inequality to estimate covariance matrices and a Wasserstein distance bound to quantify the nonlinearity of the underlying manifold and the non-uniformity of the probability measure.
first_indexed 2024-03-07T07:45:27Z
format Internet publication
id oxford-uuid:6aec487a-847d-4f58-8671-81facd82c3df
institution University of Oxford
language English
last_indexed 2024-09-25T04:03:33Z
publishDate 2021
record_format dspace
spelling oxford-uuid:6aec487a-847d-4f58-8671-81facd82c3df; 2024-05-13T14:43:53Z; Tangent space and dimension estimation with the Wasserstein distance; Internet publication; http://purl.org/coar/resource_type/c_7ad9; uuid:6aec487a-847d-4f58-8671-81facd82c3df; English; Symplectic Elements; 2021; Lim, U; Oberhauser, H; Nanda, V
title Tangent space and dimension estimation with the Wasserstein distance