Data-driven Mechanistic Modeling of 3D Human Genome

Bibliographic Details
Main Author: Qi, Yifeng
Other Authors: Zhang, Bin
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/143360
_version_ 1826189033091366912
author Qi, Yifeng
author2 Zhang, Bin
author_facet Zhang, Bin
Qi, Yifeng
author_sort Qi, Yifeng
collection MIT
description The three-dimensional (3D) organization of the human genome regulates DNA-templated processes, including gene transcription, gene regulation, and DNA replication, which are crucial for cell differentiation and cell function. Computational modeling is an efficient and effective way to build high-resolution 3D genome structures and improve our understanding of these molecular processes. My PhD research has focused on developing a data-driven, mechanistic modeling framework to better understand the physical principles of how the genome organizes, as well as the mechanisms of biological processes coupled to genome structure, such as the coalescence of nuclear bodies. This thesis is organized as follows. In the first chapter, we introduce a computational model to simulate chromatin structure and dynamics. The model defines chromatin states from one-dimensional genomics and epigenomics data and quantitatively learns the interaction patterns between these states from experimental contact data. Once trained, the model can make de novo predictions of 3D chromatin structures at five-kilobase resolution across different cell types. The manuscript associated with this study is published in PLoS Computational Biology, 15.6, e1007024 (2019). In the second chapter, we expand the spatial scale of the model to study the organization of the entire diploid human genome within the nucleus. The model is both data-driven and mechanistic: its energy function is written out explicitly based on biologically motivated hypotheses, and all parameters are quantitatively derived from experimental contact data. The model has proven useful both in reconstructing whole-genome structures and in exploring the physical and biological principles of genome organization. The manuscript associated with this study is published in Biophysical Journal, 119, 1905 (2020).
In the third chapter, we apply the data-driven modeling framework to study the thermodynamics and kinetics of the formation and coalescence of nuclear bodies. Our study suggests that protein-chromatin interactions facilitate the nucleation of droplets but hinder their coarsening because of the correlated motion between droplets and the chromatin network: as droplets coalesce, the chromatin network becomes increasingly constrained, which is entropically unfavorable. Protein-chromatin interactions therefore arrest phase separation in multi-droplet states and may drive the variation in nuclear body numbers across cell types. The manuscript associated with this study is published in Nature Communications, 12, 1 (2021).
first_indexed 2024-09-23T08:08:48Z
format Thesis
id mit-1721.1/143360
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T08:08:48Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/143360 2022-06-16T03:58:15Z Data-driven Mechanistic Modeling of 3D Human Genome Qi, Yifeng Zhang, Bin Massachusetts Institute of Technology. Department of Chemistry Ph.D. 2022-06-15T13:15:11Z 2022-06-15T13:15:11Z 2022-02 2022-03-03T18:35:03.097Z Thesis https://hdl.handle.net/1721.1/143360 0000-0002-1511-5818 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
title Data-driven Mechanistic Modeling of 3D Human Genome
url https://hdl.handle.net/1721.1/143360