Statistical modeling to quantify the uncertainty of FoldX-predicted protein folding and binding stability
Abstract. Background: Computational methods for predicting protein stability changes upon missense mutations are invaluable tools in high-throughput studies involving large numbers of protein variants. However, they are limited by wide variation in accuracy and by the difficulty of assessing prediction unc...
Main Authors: | Yesol Sapozhnikov, Jagdish Suresh Patel, F. Marty Ytreberg, Craig R. Miller |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2023-11-01 |
Series: | BMC Bioinformatics |
Subjects: | Protein stability; Protein mutations; Stability prediction; Error prediction; Statistical model |
Online Access: | https://doi.org/10.1186/s12859-023-05537-0 |
_version_ | 1827707929840582656 |
---|---|
author | Yesol Sapozhnikov; Jagdish Suresh Patel; F. Marty Ytreberg; Craig R. Miller |
author_facet | Yesol Sapozhnikov; Jagdish Suresh Patel; F. Marty Ytreberg; Craig R. Miller |
author_sort | Yesol Sapozhnikov |
collection | DOAJ |
description | Abstract. Background: Computational methods for predicting protein stability changes upon missense mutations are invaluable tools in high-throughput studies involving large numbers of protein variants. However, they are limited by wide variation in accuracy and by the difficulty of assessing prediction uncertainty. Using a popular computational tool, FoldX, we develop a statistical framework that quantifies the uncertainty of predicted changes in protein stability. Results: We show that multiple linear regression models can be used to quantify the uncertainty associated with FoldX predictions for individual mutations. Comparing the performance of models of varying complexity, we find that model precision improves significantly when molecular dynamics simulation is used as part of the FoldX workflow. Based on the model that incorporates information from molecular dynamics, biochemical properties, and FoldX energy terms, we can generally expect upper bounds on the uncertainty of ± 2.9 kcal/mol for folding stability predictions and ± 3.5 kcal/mol for binding stability predictions. The uncertainty for individual mutations varies; our model estimates it using FoldX energy terms, biochemical properties of the mutated residue, and the variability among snapshots from molecular dynamics simulation. Conclusions: Using a linear regression framework, we construct models to predict the uncertainty associated with FoldX predictions of stability changes upon mutation. This technique is straightforward and can be extended to other computational methods. |
first_indexed | 2024-03-10T16:56:32Z |
format | Article |
id | doaj.art-114f10095816436fb96fe61190b50749 |
institution | Directory of Open Access Journals |
issn | 1471-2105 |
language | English |
last_indexed | 2024-03-10T16:56:32Z |
publishDate | 2023-11-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-114f10095816436fb96fe61190b50749; 2023-11-20T11:06:27Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2023-11-01; 24; 1; 1; 18; 10.1186/s12859-023-05537-0; Statistical modeling to quantify the uncertainty of FoldX-predicted protein folding and binding stability; Yesol Sapozhnikov (Program in Bioinformatics and Computational Biology, University of Idaho); Jagdish Suresh Patel (Department of Chemical and Biological Engineering, University of Idaho); F. Marty Ytreberg (Department of Physics, University of Idaho); Craig R. Miller (Department of Biological Sciences, University of Idaho); https://doi.org/10.1186/s12859-023-05537-0; Protein stability; Protein mutations; Stability prediction; Error prediction; Statistical model |
title | Statistical modeling to quantify the uncertainty of FoldX-predicted protein folding and binding stability |
topic | Protein stability; Protein mutations; Stability prediction; Error prediction; Statistical model |
url | https://doi.org/10.1186/s12859-023-05537-0 |