Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding problem
Construction of a scaffold structure that supports a desired motif, conferring protein function, shows promise for the design of vaccines and enzymes. But a general solution to this motif-scaffolding problem remains open. Current machine-learning techniques for scaffold design are either limited to...
Main Author: | Yim, Jason |
---|---|
Other Authors: | Jaakkola, Tommi S. |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2023 |
Online Access: | https://hdl.handle.net/1721.1/150230 |
_version_ | 1826194922363944960 |
---|---|
author | Yim, Jason |
author2 | Jaakkola, Tommi S. |
collection | MIT |
description | Construction of a scaffold structure that supports a desired motif, conferring protein function, shows promise for the design of vaccines and enzymes. But a general solution to this motif-scaffolding problem remains open. Current machine-learning techniques for scaffold design are either limited to unrealistically small scaffolds (up to length 20) or struggle to produce multiple diverse scaffolds. We propose to learn a distribution over diverse and longer protein backbone structures via an E(3)-equivariant graph neural network. We develop SMCDiff to efficiently sample scaffolds from this distribution conditioned on a given motif; our algorithm is the first to theoretically guarantee conditional samples from a diffusion model in the large-compute limit. We evaluate our designed backbones by how well they align with AlphaFold2-predicted structures. We show that our method can (1) sample scaffolds up to 80 residues and (2) achieve structurally diverse scaffolds for a fixed motif. |
first_indexed | 2024-09-23T10:04:15Z |
format | Thesis |
id | mit-1721.1/150230 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T10:04:15Z |
publishDate | 2023 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/150230; 2023-04-01T03:49:32Z; Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding problem; Yim, Jason; Jaakkola, Tommi S.; Barzilay, Regina; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; S.M.; 2023-03-31T14:41:12Z; 2023-03-31T14:41:12Z; 2023-02; 2023-02-28T14:36:10.371Z; Thesis; https://hdl.handle.net/1721.1/150230; In Copyright - Educational Use Permitted; Copyright MIT; http://rightsstatements.org/page/InC-EDU/1.0/; application/pdf; Massachusetts Institute of Technology |
title | Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding problem |