Protein design using structure-based residue preferences
Abstract Recent developments in protein design rely on large neural networks with up to 100s of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues—without accounting for...
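Main Authors: | David Ding, Ada Y. Shaw, Sam Sinai, Nathan Rollins, Noam Prywes, David F. Savage, Michael T. Laub, Debora S. Marks |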
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-02-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-024-45621-4 |
_version_ | 1827326852288479232 |
---|---|
author | David Ding, Ada Y. Shaw, Sam Sinai, Nathan Rollins, Noam Prywes, David F. Savage, Michael T. Laub, Debora S. Marks |
author_sort | David Ding |
collection | DOAJ |
description | Abstract Recent developments in protein design rely on large neural networks with up to hundreds of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues, without accounting for mutation interactions, explain much and sometimes virtually all of the combinatorial mutation effects across 8 datasets (R² ~ 78-98%). Hence, few observations (~100 times the number of mutated residues) enable accurate prediction of held-out variant effects (Pearson r > 0.80). We hypothesized that the local structural context around a residue could be sufficient to predict mutation preferences, and developed an unsupervised approach termed CoVES (Combinatorial Variant Effects from Structure). Our results suggest that CoVES not only outperforms model-free methods but also performs similarly to complex models for creating functional and diverse protein variants. CoVES offers an effective alternative to complicated models for identifying functional protein mutations. |
first_indexed | 2024-03-07T14:51:27Z |
format | Article |
id | doaj.art-5b7c1d3e4ad64c33ae3f2f2b7e909e0d |
institution | Directory of Open Access Journals |
issn | 2041-1723 |
language | English |
last_indexed | 2024-03-07T14:51:27Z |
publishDate | 2024-02-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj.art-5b7c1d3e4ad64c33ae3f2f2b7e909e0d; 2024-03-05T19:41:11Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2024-02-01; 151112; 10.1038/s41467-024-45621-4; Protein design using structure-based residue preferences; David Ding (Innovative Genomics Institute, University of California); Ada Y. Shaw (Department of Systems Biology, Harvard Medical School); Sam Sinai (Dyno Therapeutics); Nathan Rollins (Seismic Therapeutics, Lab Central); Noam Prywes (Innovative Genomics Institute, University of California); David F. Savage (Innovative Genomics Institute, University of California); Michael T. Laub (Department of Biology, Massachusetts Institute of Technology); Debora S. Marks (Department of Systems Biology, Harvard Medical School); https://doi.org/10.1038/s41467-024-45621-4 |
title | Protein design using structure-based residue preferences |
url | https://doi.org/10.1038/s41467-024-45621-4 |