Structure learning in high-dimensional graphical models

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Bibliographic Details
Main Author: Wang, Yuhao,Ph.D.Massachusetts Institute of Technology.
Other Authors: Caroline Uhler.
Format: Thesis
Language:eng
Published: Massachusetts Institute of Technology 2019
Subjects:
Online Access:https://hdl.handle.net/1721.1/122688
_version_ 1811096260066672640
author Wang, Yuhao,Ph.D.Massachusetts Institute of Technology.
author2 Caroline Uhler.
author_facet Caroline Uhler.
Wang, Yuhao,Ph.D.Massachusetts Institute of Technology.
author_sort Wang, Yuhao,Ph.D.Massachusetts Institute of Technology.
collection MIT
description This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
first_indexed 2024-09-23T16:41:03Z
format Thesis
id mit-1721.1/122688
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T16:41:03Z
publishDate 2019
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/1226882019-11-05T03:00:54Z Structure learning in high-dimensional graphical models Wang, Yuhao,Ph.D.Massachusetts Institute of Technology. Caroline Uhler. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Electrical Engineering and Computer Science. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 223-232). In this thesis, we develop efficient and provably consistent algorithms for learning the structure of undirected and directed (causal) graphical models in the high-dimensional setting. Structure learning in graphical models is a central problem in statistics with numerous applications including learning gene regulatory networks from RNA-seq data and learning the dependence structure among stocks from financial time series. Part I of this thesis investigates the problem of learning causal directed acyclic graph (DAG) models from a combination of observational and interventional data. While previous methods considered greedy search algorithms on the space of graphs, we propose to view a DAG as given by a permutation and an undirected graph and instead consider greedy search on the smaller space of permutations. We present the first consistency guarantees of a permutation-based greedy search algorithm based on observational data. In addition, we show that this algorithm naturally extends to the interventional setting, thereby resulting in the first provably consistent algorithm for causal structure discovery from a mix of observational and interventional data. In Part II, we consider causal inference based on heterogeneous observational data collected from naturally perturbed systems. Specifically, we investigate two questions, namely 1) learning the difference between two causal DAGs, and 2) jointly estimating multiple related causal DAGs. To answer question 1), we provide the first provably consistent method for directly estimating the differences in a pair of causal DAGs without separately learning two possibly large and dense DAGs. To answer question 2), we provide a joint estimation procedure based on ℓ0-penalized maximum likelihood estimation and prove that such procedure leads to a faster convergence rate than estimating each DAG separately. Finally, in Part III, we consider the problem of estimating undirected graphical models under distributional constraints. More specifically, we consider a particular form of positive dependence, known as total positivity. Such a constraint is relevant for example for portfolio selection, since assets are often positively dependent. Methods for learning undirected graphical models usually require a particular choice of the tuning parameter for consistent estimation, which is in general unknown a priori and hence a major limitation in applications. We here show that an undirected graphical model under total positivity can be learned consistently without any tuning parameters. The proposed methods are illustrated on various synthetic and real datasets from genomics and finance. Grants from: DARPA (W911NF-16-1-0551), Office of Naval Research Grants (N00014-17-1-2147 and N00014-18-1-2765), MIT-IBM Watson AI Lab Award (W1771646) and NSF Career Grant (DMS-1651995) by Yuhao Wang. Ph. D. Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science 2019-11-04T19:53:19Z 2019-11-04T19:53:19Z 2019 2019 Thesis https://hdl.handle.net/1721.1/122688 1124762629 eng MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582 232 pages application/pdf Massachusetts Institute of Technology
spellingShingle Electrical Engineering and Computer Science.
Wang, Yuhao,Ph.D.Massachusetts Institute of Technology.
Structure learning in high-dimensional graphical models
title Structure learning in high-dimensional graphical models
title_full Structure learning in high-dimensional graphical models
title_fullStr Structure learning in high-dimensional graphical models
title_full_unstemmed Structure learning in high-dimensional graphical models
title_short Structure learning in high-dimensional graphical models
title_sort structure learning in high dimensional graphical models
topic Electrical Engineering and Computer Science.
url https://hdl.handle.net/1721.1/122688
work_keys_str_mv AT wangyuhaophdmassachusettsinstituteoftechnology structurelearninginhighdimensionalgraphicalmodels