From genetics to disease: Algorithms to decode somatic mutations

Bibliographic Details
Main Author: Sherman, Maxwell A.
Other Authors: Berger, Bonnie
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/150068
collection MIT
description A long-standing goal of biology is to understand how the 3 billion bases of DNA in each human cell contribute to molecular, cellular, and, ultimately, organism function. Somatic mutations, which arise in cells during the course of life, are natural experiments that can be leveraged to provide insight into this profound question. This thesis develops computational methods to identify somatic mutations and infer their phenotypic relationships from population-scale genome sequencing. The methods are developed and applied in the context of two human diseases, autism spectrum disorder and cancer. First, we develop a suite of computational tools to detect somatic copy number variants that likely arose during early embryonic development. We apply this tool set to establish that such CNVs contribute substantially to the risk of developing autism spectrum disorder in a small number of carriers. We next develop a general purpose method for modeling discrete stochastic processes at multiple resolutions. We demonstrate the utility of the method by modeling patterns of somatic mutations across the cancer genome. We finally extend and apply the aforementioned method to map somatic mutation rates in 37 types of cancer and identify sets of mutations that likely drive cancer growth in both coding and noncoding regions of the genome. Broadly, this work demonstrates how the unique challenges of biological data can both inform and benefit from computational research.
first_indexed 2024-09-23T14:16:08Z
format Thesis
id mit-1721.1/150068
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T14:16:08Z
publishDate 2023
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/150068 2023-04-01T03:20:30Z From genetics to disease: Algorithms to decode somatic mutations. Sherman, Maxwell A.; Berger, Bonnie; Loh, Po-Ru. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Ph.D. 2023-03-31T14:29:38Z 2023-03-31T14:29:38Z 2023-02 2023-02-28T14:39:38.653Z Thesis https://hdl.handle.net/1721.1/150068 In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
title From genetics to disease: Algorithms to decode somatic mutations
url https://hdl.handle.net/1721.1/150068