Tangent space and dimension estimation with the Wasserstein distance
Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this e...
Main Authors: | Lim, U; Oberhauser, H; Nanda, V |
---|---|
Format: | Journal article |
Language: | English |
Published: | Society for Industrial and Applied Mathematics, 2024 |
_version_ | 1811141195715313664 |
---|---|
author | Lim, U; Oberhauser, H; Nanda, V |
collection | OXFORD |
description | Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this estimation is Local PCA, a local version of principal component analysis. Our results accommodate noisy, nonuniformly distributed data, with noise that may vary across the manifold, and allow simultaneous estimation at multiple points. Crucially, all of the constants appearing in our bound are explicitly described. The proof uses a matrix concentration inequality to estimate covariance matrices and a Wasserstein distance bound to quantify the nonlinearity of the underlying manifold and the nonuniformity of the probability measure. |
first_indexed | 2024-09-25T04:34:01Z |
format | Journal article |
id | oxford-uuid:c54c3e19-ac09-406e-988c-2aeb75e749c7 |
institution | University of Oxford |
language | English |
last_indexed | 2024-09-25T04:34:01Z |
publishDate | 2024 |
publisher | Society for Industrial and Applied Mathematics |
record_format | dspace |
spelling | oxford-uuid:c54c3e19-ac09-406e-988c-2aeb75e749c7; 2024-09-06T09:02:45Z; Tangent space and dimension estimation with the Wasserstein distance; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:c54c3e19-ac09-406e-988c-2aeb75e749c7; English; Symplectic Elements; Society for Industrial and Applied Mathematics; 2024; Lim, U; Oberhauser, H; Nanda, V; Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this estimation is Local PCA, a local version of principal component analysis. Our results accommodate for noisy nonuniform data distribution with the noise that may vary across the manifold, and allow simultaneous estimation at multiple points. Crucially, all of the constants appearing in our bound are explicitly described. The proof uses a matrix concentration inequality to estimate covariance matrices and a Wasserstein distance bound for quantifying nonlinearity of the underlying manifold and nonuniformity of the probability measure. |
title | Tangent space and dimension estimation with the Wasserstein distance |
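
The description names Local PCA as the estimation algorithm: principal component analysis applied to the sample points near a chosen base point. The following is a minimal sketch of that general idea, not the authors' procedure. The function name `local_pca`, the neighbourhood `radius`, and the rule of picking the dimension at the largest ratio between consecutive eigenvalues are all assumptions made for illustration; the article's explicit constants and thresholds are not reproduced in this record.

```python
import numpy as np


def local_pca(points, center, radius):
    """Illustrative Local PCA: estimate dimension and a tangent basis at `center`.

    `radius` and the eigenvalue-ratio rule below are hypothetical choices,
    not the thresholds derived in the article.
    """
    # Keep only the sample points inside the ball of the given radius.
    dists = np.linalg.norm(points - center, axis=1)
    nbrs = points[dists <= radius]
    if len(nbrs) < 2:
        raise ValueError("not enough neighbours; increase radius")

    # Empirical covariance matrix of the neighbourhood.
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / len(nbrs)

    # Eigen-decomposition of the symmetric covariance, sorted descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumed rule: the dimension is where consecutive eigenvalues drop the most.
    ratios = eigvals[:-1] / np.maximum(eigvals[1:], 1e-12)
    dim = int(np.argmax(ratios)) + 1

    # The leading `dim` eigenvectors span the estimated tangent space.
    return dim, eigvecs[:, :dim]


# Usage: noisy samples near a unit circle embedded in R^3.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
circle = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])
points = circle + 0.01 * rng.normal(size=circle.shape)

dim, basis = local_pca(points, center=np.array([1.0, 0.0, 0.0]), radius=0.3)
print(dim, basis[:, 0])
```

In this toy setting the tangent line at (1, 0, 0) points along the second coordinate axis, so the leading eigenvector should align with it up to sign. How large the sample must be, and how the radius should be chosen relative to the noise and curvature, is the question the article's bounds address.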