Statistical aspects of optimal transport


Bibliographic Details
Main Author: Stromme, Austin J.
Other Authors: Rigollet, Philippe
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/152775
_version_ 1826203960781832192
author Stromme, Austin J.
author2 Rigollet, Philippe
author_facet Rigollet, Philippe
Stromme, Austin J.
author_sort Stromme, Austin J.
collection MIT
description Optimal transport (OT) is a flexible framework for contrasting and interpolating probability measures which has recently been applied throughout science, including in machine learning, statistics, graphics, economics, biology, and more. In this thesis, we study several statistical problems at the forefront of applied optimal transport, prioritizing statistically and computationally practical results. We begin by considering one of the most popular applications of OT in practice, the barycenter problem, providing dimension-free rates of statistical estimation. In the Gaussian case, we analyze first-order methods for computing barycenters, and develop global, dimension-free rates of convergence despite the non-convexity of the problem. Extending beyond the Gaussian case, however, is challenging due to the fundamental curse of dimensionality for OT, which motivates the study of a regularized, and in fact more computationally feasible, form of optimal transport, dubbed entropic optimal transport (entropic OT). Recent work has suggested that entropic OT may escape the curse of dimensionality of un-regularized OT, and in this thesis we develop a refined theory of the statistical behavior of entropic OT by showing that entropic OT attains truly dimension-free rates of convergence in the large regularization regime, and automatically adapts to the intrinsic dimension of the data in the small regularization regime. We also consider the rate of approximation of entropic OT in the semi-discrete case, and complement these results by considering the problem of trajectory reconstruction, proposing two practical methods based on un-regularized and entropic OT, respectively.
first_indexed 2024-09-23T12:46:38Z
format Thesis
id mit-1721.1/152775
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T12:46:38Z
publishDate 2023
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/152775 2023-11-03T03:33:55Z Statistical aspects of optimal transport Stromme, Austin J. Rigollet, Philippe Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2023-11-02T20:15:21Z 2023-11-02T20:15:21Z 2023-09 2023-09-21T14:26:18.412Z Thesis https://hdl.handle.net/1721.1/152775 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Stromme, Austin J.
Statistical aspects of optimal transport
title Statistical aspects of optimal transport
title_full Statistical aspects of optimal transport
title_fullStr Statistical aspects of optimal transport
title_full_unstemmed Statistical aspects of optimal transport
title_short Statistical aspects of optimal transport
title_sort statistical aspects of optimal transport
url https://hdl.handle.net/1721.1/152775
work_keys_str_mv AT strommeaustinj statisticalaspectsofoptimaltransport