SAAFEC-SEQ: A Sequence-Based Method for Predicting the Effect of Single Point Mutations on Protein Thermodynamic Stability

Modeling the effect of mutations on protein thermodynamic stability is useful for protein engineering and for understanding the molecular mechanisms of disease-causing variants. Here, we report a new development of the SAAFEC method, SAAFEC-SEQ, which is a gradient boosting decision tree machine learni...

Bibliographic Details
Main Authors: Gen Li, Shailesh Kumar Panday, Emil Alexov
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: International Journal of Molecular Sciences
Subjects: thermodynamics stability; single point mutation; sequence-based; machine learning; web server
Online Access: https://www.mdpi.com/1422-0067/22/2/606
collection DOAJ
description Modeling the effect of mutations on protein thermodynamic stability is useful for protein engineering and for understanding the molecular mechanisms of disease-causing variants. Here, we report a new development of the SAAFEC method, SAAFEC-SEQ, a gradient boosting decision tree machine learning method that predicts the change in folding free energy caused by amino acid substitutions. The method does not require the 3D structure of the corresponding protein, only its sequence, and can therefore be applied to genome-scale investigations, where structural information is very sparse. SAAFEC-SEQ uses physicochemical properties, sequence features, and evolutionary information features to make its predictions. It is shown to consistently outperform all existing state-of-the-art sequence-based methods in both Pearson correlation coefficient and root-mean-squared error, as benchmarked on several independent datasets. SAAFEC-SEQ has been implemented as a web server and is also available as stand-alone code that can be downloaded and embedded into other researchers’ code.
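The two benchmark metrics named in the description, the Pearson correlation coefficient and the root-mean-squared error, can be illustrated with a minimal sketch. This is not the SAAFEC-SEQ code: the function names `pearson_r` and `rmse` are hypothetical helpers, and the folding free energy changes below are made-up numbers used only to show the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-squared error between predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


# Made-up folding free energy changes (kcal/mol), for illustration only.
experimental = [-1.2, 0.3, -2.5, 0.8, -0.6]
predicted = [-0.9, 0.1, -1.8, 0.5, -1.0]

print(f"PCC  = {pearson_r(predicted, experimental):.3f}")
print(f"RMSE = {rmse(predicted, experimental):.3f} kcal/mol")
```

A higher correlation and a lower error both indicate closer agreement between predicted and measured values, which is the sense in which the description compares methods.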
id doaj.art-22fbd8c5c58444d5afab2fd1ed686158
institution Directory Open Access Journal
issn 1661-6596
1422-0067
doi 10.3390/ijms22020606
volume 22
issue 2
article_number 606
affiliation Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA (all three authors)
topic thermodynamics stability
single point mutation
sequence-based
machine learning
web server