Bayesian Linear Modeling in High Dimensions: Advances in Hierarchical Modeling, Inference, and Evaluation

Bibliographic Details
Main Author: Trippe, Brian L.
Other Authors: Broderick, Tamara
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/144554
Description: Across the sciences, social sciences, and engineering, applied statisticians seek to understand complex relationships in increasingly large datasets. In statistical genetics, for example, we observe up to millions of genetic variations in each of thousands of individuals and wish to associate these variations with the development of disease. For ‘high-dimensional’ problems like this, the languages of linear modeling and Bayesian statistics appeal because they provide interpretability, coherent uncertainty, and the capacity for information sharing across related datasets. At the same time, high dimensionality introduces several challenges not solved by existing methodology. This thesis addresses three challenges that arise when applying Bayesian methodology in high dimensions. The first challenge is how to apply hierarchical modeling, a mainstay of Bayesian inference, to share information between multiple linear models with many covariates (for example, genetic studies of multiple related diseases). The first part of the thesis demonstrates that the default approach to hierarchical linear modeling fails in high dimensions and presents a new, effective model for this regime. The second part addresses the computational challenge presented by Bayesian inference in high dimensions: existing methods demand time that scales super-linearly with the number of covariates. We present two algorithms that permit fast, accurate inferences by leveraging (i) low-rank approximations of data or (ii) parallelism across a certain class of Markov chain Monte Carlo algorithms. The final part addresses the challenge of evaluation. Modern statistics provides an expansive toolkit for estimating unknown parameters, and a typical Bayesian analysis justifies its estimates through belief in subjective a priori assumptions. We address this reliance on subjective assumptions by introducing a measure of confidence in the new estimate (the ‘c-value’) that can diagnose the accuracy of a Bayesian estimate without requiring this subjectivism.
Department: Massachusetts Institute of Technology. Computational and Systems Biology Program
Degree: Ph.D.
Date: 2022-05
Record Timestamps: 2022-08-30T03:36:27Z; 2022-08-29T15:55:28Z; 2022-06-14T23:24:56.147Z
Rights: In Copyright - Educational Use Permitted; Copyright MIT; http://rightsstatements.org/page/InC-EDU/1.0/
File Format: application/pdf