Gene-regulatory circuitry of disease risk and progression

Bibliographic Details
Main Author: Boix, Carles
Other Authors: Kellis, Manolis
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/144638
Description
Summary: Complex diseases act heterogeneously through a remarkable diversity of cellular and functional outcomes across the human body, primarily through epigenetic organization and gene regulation. Genetics is a powerful tool to shed light on genes involved in disease, but we need maps of gene regulation and function at tissue and cell-type resolution to better understand disease mechanisms. Increasingly high-resolution measurements of cellular epigenomes and transcriptomes allow us to observe this cellular heterogeneity at scale. To capture this heterogeneity systematically, we require scalable statistical tools that can interpret and model gene regulation, its machinery, and complex transcriptional states. In this thesis, I build references of gene regulation and function in health and disease to interpret disease-linked genomic loci, and I develop methods to learn context-specific representations. I use both to understand how a single genome yields robust and diverse transcriptional outcomes through modularity of biological functions, how this heterogeneity is maintained, and how it breaks down over time and in disease. In my first project, I build an integrative annotated reference of the human epigenome, systematically integrating multiple annotation projects covering hundreds of human cell lines, tissues, and states. I use this reference to map and dissect non-coding disease loci, specifically mapping multiple pleiotropic disease loci in coronary artery disease and dissecting a locus showing tissue-specific gene involvement. In my second project, I model Alzheimer’s disease (AD) progression across affected brain regions using single-cell transcriptomics, identify specifically vulnerable neuronal populations by brain region along the disease trajectory, and uncover pathways and neuronal circuits that may mediate AD vulnerability. To analyze large-scale transcriptomic references, I develop a fast and scalable method for calling high-resolution gene expression modules from single-cell data, use it to map the complex and modular glial changes underlying AD, and highlight metabolic and immune switches in cognitive impairment. In my third project, I investigate somatic mosaicism as a source of cellular dysfunction and as a route to uncovering missing genetic determinants and mechanisms in AD. To do so, I develop methods to map mosaic burden in individual cells jointly with expression and find increased cell-type-specific somatic mosaic burden in dementia, which I map to specific pathways and genes implicated in AD. Finally, I model neuronal trajectories through neurodegeneration in human brains and mouse models of AD, developing methods for building disease-driven pseudotime trajectories and mapping transcriptomic changes along paths to neuronal senescence driven by DNA damage accumulation.