FiTMuSiC: leveraging structural and (co)evolutionary data for protein fitness prediction
Abstract Systematically predicting the effects of mutations on protein fitness is essential for the understanding of genetic diseases. Indeed, predictions complement experimental efforts in analyzing how variants lead to dysfunctional proteins that in turn can cause diseases. Here we present our new...
Main Authors: | Matsvei Tsishyn, Gabriel Cia, Pauline Hermans, Jean Kwasigroch, Marianne Rooman, Fabrizio Pucci |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-04-01 |
Series: | Human Genomics |
Subjects: | Protein variants interpretation, Fitness, CAGI6, Pathogenicity |
Online Access: | https://doi.org/10.1186/s40246-024-00605-9 |
_version_ | 1797199326184734720 |
---|---|
author | Matsvei Tsishyn Gabriel Cia Pauline Hermans Jean Kwasigroch Marianne Rooman Fabrizio Pucci |
author_sort | Matsvei Tsishyn |
collection | DOAJ |
description | Systematically predicting the effects of mutations on protein fitness is essential for understanding genetic diseases. Indeed, predictions complement experimental efforts in analyzing how variants lead to dysfunctional proteins that in turn can cause diseases. Here we present our new fitness predictor, FiTMuSiC, which leverages structural, evolutionary and coevolutionary information. We show that FiTMuSiC predicts fitness with high accuracy despite the simplicity of its underlying model: it was among the top predictors on the hydroxymethylbilane synthase (HMBS) target of the sixth round of the Critical Assessment of Genome Interpretation challenge (CAGI6) and performs as well as much more complex deep learning models such as AlphaMissense. To further demonstrate FiTMuSiC’s robustness, we compared its predictions with in vitro activity data on HMBS, variant fitness data on human glucokinase (GCK), and variant deleteriousness data on HMBS and GCK. These analyses confirm FiTMuSiC’s qualities and accuracy, which compare favorably with those of other predictors. Additionally, FiTMuSiC returns two scores that separately describe the functional and structural effects of the variant, thus providing mechanistic insight into why the variant leads to fitness loss or gain. We also provide an easy-to-use webserver at https://babylone.ulb.ac.be/FiTMuSiC, which is freely available for academic use and requires no bioinformatics expertise, making the tool accessible to the entire scientific community. |
first_indexed | 2024-04-24T07:13:58Z |
format | Article |
id | doaj.art-4857b80b30d148b5b93c8d5dcbd7d1b8 |
institution | Directory Open Access Journal |
issn | 1479-7364 |
language | English |
last_indexed | 2024-04-24T07:13:58Z |
publishDate | 2024-04-01 |
publisher | BMC |
record_format | Article |
series | Human Genomics |
spelling | doaj.art-4857b80b30d148b5b93c8d5dcbd7d1b8; 2024-04-21T11:24:49Z; eng; BMC; Human Genomics; 1479-7364; 2024-04-01; 18; 1; 1; 10; 10.1186/s40246-024-00605-9; FiTMuSiC: leveraging structural and (co)evolutionary data for protein fitness prediction; Matsvei Tsishyn, Gabriel Cia, Pauline Hermans, Jean Kwasigroch, Marianne Rooman, Fabrizio Pucci (all: Computational Biology and Bioinformatics, Université Libre de Bruxelles); https://doi.org/10.1186/s40246-024-00605-9; Protein variants interpretation; Fitness; CAGI6; Pathogenicity |
title | FiTMuSiC: leveraging structural and (co)evolutionary data for protein fitness prediction |
title_sort | fitmusic leveraging structural and co evolutionary data for protein fitness prediction |
topic | Protein variants interpretation Fitness CAGI6 Pathogenicity |
url | https://doi.org/10.1186/s40246-024-00605-9 |