Gradient-based dimension reduction for Bayesian inverse problems and simulation-based inference


Bibliographic Details
Main Author: Brennan, Michael Cian
Other Authors: Marzouk, Youssef
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/151914
ORCID: https://orcid.org/0000-0001-7812-9347
Description:
Inference is a pervasive task in science and engineering applications. The Bayesian approach to inference facilitates informed decision making by quantifying uncertainty in parameters and predictions, but can be computationally demanding. This thesis focuses on Bayesian methods for inverse problems governed by partial differential equations and for simulation-based (likelihood-free) inference: in both settings, the high dimensionality of model parameters and/or data can render naïve posterior exploration intractable. We address this challenge by developing gradient-based methods that discover and exploit several notions of low-dimensional structure in inference, and then linking these dimension reduction methods to inference algorithms that employ measure transport. Dimension reduction substantially decreases the computational burden of accurate inference in high-dimensional problems, and can also reveal interpretable structure that provides qualitative insights. Our contributions are grouped into three primary topics, as follows.

Low-dimensional subspaces. First, we propose an iterative framework for solving high-dimensional Bayesian inference problems using transport maps or flows that act only on low-dimensional subspaces. We provide a principled way of identifying such subspaces by minimizing an upper bound on the Kullback–Leibler divergence between the current approximation and the target (posterior) distribution. This approach thus focuses the expressiveness of a transport map along the directions of most significant discrepancy from the target, and can be used to greedily build deep compositions of maps in which low-dimensional projections of the parameters are iteratively transformed to match the posterior. We prove weak convergence of the generated sequence of distributions to the posterior and demonstrate the benefits of the framework on an array of challenging high-dimensional inference problems.
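For intuition about the gradient-based subspace identification described above: a common recipe in this literature builds a diagnostic matrix from Monte Carlo samples of the log-density gradient and retains its leading eigenvectors. The sketch below is purely illustrative; the function name and the eigenvalue-mass truncation rule are ours, not the thesis's exact construction.

```python
import numpy as np

def diagnostic_subspace(grads, tol=1e-2):
    """Estimate a dominant parameter subspace from log-density gradients.

    grads: (N, d) array; each row is one evaluation of the gradient of the
    (unnormalized) log-density at a sample point.
    Returns (U_r, eigvals): an orthonormal basis for the retained subspace and
    the eigenvalues of the diagnostic matrix, sorted in descending order.
    """
    G = np.asarray(grads, dtype=float)
    N, _ = G.shape
    # Monte Carlo estimate of the diagnostic matrix H = E[g g^T]
    H = G.T @ G / N
    # H is symmetric PSD; eigh returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(H)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    # Keep the smallest r directions capturing a (1 - tol) fraction of the
    # total eigenvalue mass
    r = int(np.searchsorted(np.cumsum(eigvals), (1.0 - tol) * eigvals.sum()) + 1)
    return eigvecs[:, :r], eigvals
```

If the log-density varies only along a low-dimensional subspace, all gradients lie in that subspace, the diagnostic matrix is (numerically) low rank, and the retained eigenvectors span exactly the directions along which the distribution differs from a flat reference.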
Low-rank conditional structure. Second, we explore the notion of low-rank conditional structure: summarizing conditioning variables with low-dimensional projections. We show how such summaries can be derived by minimizing a tractable gradient-based bound on mutual information, and then develop a framework that uses low-rank conditional structure in the posterior distribution, or in the joint distribution of parameters and observations, to improve approximate inference using measure transport. Our approach exploits the link between component functions of a triangular (Knothe–Rosenblatt) transport map and specific marginal-conditional distributions. Rather than approximating the target distribution globally, as in many current methods, we discover low-dimensional structure in each of these marginal-conditional distributions separately and assemble the results into a naturally sparse triangular transport map. We evaluate our approach on two nonlinear Bayesian inverse problems involving elliptic partial differential equations (steady-state Darcy flow and the Helmholtz equation), in an amortized inference setting.

Score-ratio matching. Both of the preceding contributions rely on diagnostic matrices built from evaluations of gradients of the posterior or joint log-density. Our final thrust broadens the applicability of gradient-based dimension reduction to problems where such gradients are not available. We modify score matching methods to estimate score ratios that enable our gradient-based diagnostic matrices to be computed more effectively. In particular, we propose a tailored score-network parameterization and a regularization method that exploit the presence of the low-dimensional structure we seek. We demonstrate the effectiveness of the proposed method on inference problems related to groundwater modeling and energy market modeling.
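Score matching, which the final contribution above adapts to score ratios, fits a model of the score ∇ log p(x) from samples alone, with no normalizing constant. As a minimal sketch (plain Hyvärinen score matching in the simplest linear/Gaussian case, not the thesis's score-ratio estimators or network parameterizations; the function name is ours), the objective has a closed-form minimizer that recovers the Gaussian score exactly:

```python
import numpy as np

# Hyvarinen score matching minimizes J = E[ tr(grad s(x)) + 0.5 ||s(x)||^2 ].
# For the linear score family s(x) = -W x on mean-zero data:
#   tr(grad s) = -tr(W)  and  E||s(x)||^2 = tr(W S W^T),  S = E[x x^T],
# so dJ/dW = -I + W S = 0 gives the closed-form minimizer W = S^{-1},
# i.e. score matching recovers the Gaussian score s(x) = -Sigma^{-1} x.

def score_matching_linear(X):
    """Fit s(x) = -W x by score matching; returns W, an estimate of the precision."""
    X = np.asarray(X, dtype=float)
    S = X.T @ X / len(X)      # sample second-moment matrix (mean-zero data)
    return np.linalg.inv(S)   # closed-form minimizer of the objective over W
```

Estimated scores of this kind can then stand in for exact log-density gradients when assembling the gradient outer-product diagnostic matrices described earlier, which is the role score ratios play in the final contribution.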
Departments: Massachusetts Institute of Technology, Department of Aeronautics and Astronautics; Center for Computational Science and Engineering
Degree: Ph.D.
Date Issued: 2023-06
Rights: In Copyright - Educational Use Permitted; copyright retained by author(s) (https://rightsstatements.org/page/InC-EDU/1.0/)
File Format: application/pdf