Modeling Molecular Structures with Intrinsic Diffusion Models

Since its foundations, more than one hundred years ago, the field of structural biology has strived to understand and analyze the properties of molecules and their interactions by studying the structure that they take in 3D space. However, a fundamental challenge with this approach has been the dyna...

Bibliographic Details
Main Author: Corso, Gabriele
Other Authors: Jaakkola, Tommi S.
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/150250
https://orcid.org/0000-0002-1963-8755
Description: Since its foundations, more than one hundred years ago, the field of structural biology has strived to understand and analyze the properties of molecules and their interactions by studying the structure that they take in 3D space. However, a fundamental challenge with this approach has been the dynamic nature of these particles, which forces us to model not a single structure but a whole distribution of structures for every molecular system. This thesis proposes Intrinsic Diffusion Modeling, a novel approach to this problem based on combining diffusion generative models with scientific knowledge about the flexibility of biological complexes. The knowledge of these degrees of freedom is translated into the definition of a manifold over which the diffusion process is defined. This manifold significantly reduces the dimensionality and increases the smoothness of the generation space, allowing for significantly faster and more accurate generative processes. We demonstrate the effectiveness of this approach on two fundamental tasks at the basis of computational chemistry and biology: molecular conformer generation and molecular docking. In both tasks, we construct the first deep learning method to outperform traditional computational approaches, achieving an unprecedented level of accuracy for scalable programs.
Additional Name Listed: Barzilay, Regina
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: S.M.
Date: 2023-02
Record Timestamps: 2023-04-01T03:45:33Z; 2023-03-31T14:42:43Z; 2023-02-28T14:36:00.894Z
Rights: In Copyright - Educational Use Permitted; Copyright retained by author(s); https://rightsstatements.org/page/InC-EDU/1.0/
File Format: application/pdf