An integrated protein structure fitness scoring approach for identifying native-like model structures
Structural information about a protein is pivotal to understanding its functions and its protein–protein and protein–ligand interactions. There is a widening gap between the number of known protein sequences and the number of experimentally determined structures. Protein structure prediction has emerged as an...
Main Authors: | Rahul Kaushik, Kam Y.J. Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2022-01-01 |
Series: | Computational and Structural Biotechnology Journal |
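Subjects: | Protein structure scoring function; Structure quality assessment; Protein structure modelling; Machine learning; Computational protein folding; Computational protein design |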
Online Access: | http://www.sciencedirect.com/science/article/pii/S2001037022005268 |
_version_ | 1828087918852308992 |
---|---|
author | Rahul Kaushik; Kam Y.J. Zhang |
author_sort | Rahul Kaushik |
collection | DOAJ |
description | Structural information about a protein is pivotal to understanding its functions and its protein–protein and protein–ligand interactions. There is a widening gap between the number of known protein sequences and the number of experimentally determined structures. Protein structure prediction has emerged as an efficient alternative for obtaining reliable structural information on proteins. However, it remains a challenge to identify the best model among the many predicted by one or a few structure prediction methods. Here we report ProFitFun-Meta, a neural-network-based pure single-model scoring method that assesses the quality of predicted model structures by combining the backbone dihedral angle and residue surface accessibility preferences of amino acid residues with other spatial properties of protein structures. The performance of ProFitFun-Meta was validated and benchmarked against current state-of-the-art methods on extensive datasets comprising a Test Dataset (n = 26,604), an External Dataset (n = 40,000), and a CASP14 Dataset (n = 1,200). The evaluation demonstrated its reliability and efficiency in terms of Spearman’s (ρ) and Pearson’s (r) correlation coefficients, GDT-TS loss (g), and absolute loss (d). Its improved performance over current state-of-the-art methods and the leading performers in the quality assessment category of the CASP14 experiment demonstrates its potential to become an integral component of computational pipelines for protein modeling and design. Minimal dependencies, high computational efficiency, and portability across various Linux and Windows operating systems give ProFitFun-Meta an additional edge for easy implementation and application in various regimes of computational protein folding. |
first_indexed | 2024-04-11T05:18:52Z |
format | Article |
id | doaj.art-ce3de8e08fbc425c9e2418937d256f61 |
institution | Directory of Open Access Journals |
issn | 2001-0370 |
language | English |
last_indexed | 2024-04-11T05:18:52Z |
publishDate | 2022-01-01 |
publisher | Elsevier |
record_format | Article |
series | Computational and Structural Biotechnology Journal |
spelling | doaj.art-ce3de8e08fbc425c9e2418937d256f61; 2022-12-24T04:55:16Z; eng; Elsevier; Computational and Structural Biotechnology Journal; 2001-0370; 2022-01-01; 20; 6467; 6472; An integrated protein structure fitness scoring approach for identifying native-like model structures; Rahul Kaushik [0]; Kam Y.J. Zhang [1]; [0] Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Yokohama, Kanagawa 230-0045, Japan; [1] Corresponding author.; Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Yokohama, Kanagawa 230-0045, Japan; http://www.sciencedirect.com/science/article/pii/S2001037022005268; Protein structure scoring function; Structure quality assessment; Protein structure modelling; Machine learning; Computational protein folding; Computational protein design |
title | An integrated protein structure fitness scoring approach for identifying native-like model structures |
topic | Protein structure scoring function; Structure quality assessment; Protein structure modelling; Machine learning; Computational protein folding; Computational protein design |
url | http://www.sciencedirect.com/science/article/pii/S2001037022005268 |