Protein design using structure-based residue preferences

Abstract Recent developments in protein design rely on large neural networks with up to 100s of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues—without accounting for...


Bibliographic Details
Main Authors: David Ding, Ada Y. Shaw, Sam Sinai, Nathan Rollins, Noam Prywes, David F. Savage, Michael T. Laub, Debora S. Marks
Format: Article
Language: English
Published: Nature Portfolio 2024-02-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-45621-4
collection DOAJ
description Abstract Recent developments in protein design rely on large neural networks with up to hundreds of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues, without accounting for mutation interactions, explain much and sometimes virtually all of the combinatorial mutation effects across 8 datasets (R² ~ 78-98%). Hence, few observations (~100 times the number of mutated residues) enable accurate prediction of held-out variant effects (Pearson r > 0.80). We hypothesized that the local structural contexts around a residue could be sufficient to predict mutation preferences, and developed an unsupervised approach termed CoVES (Combinatorial Variant Effects from Structure). Our results suggest that CoVES not only outperforms model-free methods but also performs similarly to complex models for creating functional and diverse protein variants. CoVES offers an effective alternative to complicated models for identifying functional protein mutations.
first_indexed 2024-03-07T14:51:27Z
id doaj.art-5b7c1d3e4ad64c33ae3f2f2b7e909e0d
institution Directory of Open Access Journals
issn 2041-1723
last_indexed 2024-03-07T14:51:27Z
spelling doaj.art-5b7c1d3e4ad64c33ae3f2f2b7e909e0d; 2024-03-05T19:41:11Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2024-02-01; 15; 1; 1; 12; 10.1038/s41467-024-45621-4; Protein design using structure-based residue preferences
Author affiliations:
David Ding: Innovative Genomics Institute, University of California
Ada Y. Shaw: Department of Systems Biology, Harvard Medical School
Sam Sinai: Dyno Therapeutics
Nathan Rollins: Seismic Therapeutics, Lab Central
Noam Prywes: Innovative Genomics Institute, University of California
David F. Savage: Innovative Genomics Institute, University of California
Michael T. Laub: Department of Biology, Massachusetts Institute of Technology
Debora S. Marks: Department of Systems Biology, Harvard Medical School