Tangent space and dimension estimation with the Wasserstein distance

Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this e...

Bibliographic Details
Main Authors: Lim, U, Oberhauser, H, Nanda, V
Format: Journal article
Language: English
Published: Society for Industrial and Applied Mathematics 2024
author Lim, U
Oberhauser, H
Nanda, V
collection OXFORD
description Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this estimation is Local PCA, a local version of principal component analysis. Our results accommodate noisy, nonuniform data distributions in which the noise may vary across the manifold, and they allow simultaneous estimation at multiple points. Crucially, all of the constants appearing in our bound are explicitly described. The proof uses a matrix concentration inequality to estimate covariance matrices and a Wasserstein distance bound to quantify the nonlinearity of the underlying manifold and the nonuniformity of the probability measure.
first_indexed 2024-09-25T04:34:01Z
format Journal article
id oxford-uuid:c54c3e19-ac09-406e-988c-2aeb75e749c7
institution University of Oxford
language English
last_indexed 2024-09-25T04:34:01Z
publishDate 2024
publisher Society for Industrial and Applied Mathematics
record_format dspace
title Tangent space and dimension estimation with the Wasserstein distance