An integrated protein structure fitness scoring approach for identifying native-like model structures

Structural information about a protein is pivotal for understanding its functions and its protein–protein and protein–ligand interactions. There is a widening gap between the number of known protein sequences and the number of experimentally determined structures. Protein structure prediction has emerged as an...

Bibliographic Details
Main Authors: Rahul Kaushik, Kam Y.J. Zhang
Format: Article
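Language: English
Published: Elsevier 2022-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Protein structure scoring function; Structure quality assessment; Protein structure modelling; Machine learning; Computational protein folding; Computational protein design
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037022005268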
_version_ 1828087918852308992
author Rahul Kaushik
Kam Y.J. Zhang
author_facet Rahul Kaushik
Kam Y.J. Zhang
author_sort Rahul Kaushik
collection DOAJ
description Structural information about a protein is pivotal for understanding its functions and its protein–protein and protein–ligand interactions. There is a widening gap between the number of known protein sequences and the number of experimentally determined structures. Protein structure prediction has emerged as an efficient alternative for delivering reliable structural information on proteins. However, it remains a challenge to identify the best model among the many predicted by one or a few structure prediction methods. Here we report ProFitFun-Meta, a neural network-based, pure single-model scoring method that assesses the quality of predicted model structures by effectively combining structural information on the backbone dihedral angle and residue surface accessibility preferences of amino acid residues with other spatial properties of protein structures. The performance of ProFitFun-Meta was validated and benchmarked against current state-of-the-art methods on extensive datasets comprising a Test Dataset (n = 26,604), an External Dataset (n = 40,000), and a CASP14 Dataset (n = 1,200). The comprehensive performance evaluation demonstrated the reliability and efficiency of ProFitFun-Meta in terms of Spearman’s (ρ) and Pearson’s (r) correlation coefficients, GDT-TS loss (g), and absolute loss (d). Its improved performance over current state-of-the-art methods and the leading performers in the quality assessment category of the CASP14 experiment demonstrates its potential to become an integral component of computational pipelines for protein modeling and design. Minimal dependencies, high computational efficiency, and portability across various Linux and Windows operating systems give ProFitFun-Meta an additional edge, easing its implementation and application in various regimes of computational protein folding.
first_indexed 2024-04-11T05:18:52Z
format Article
id doaj.art-ce3de8e08fbc425c9e2418937d256f61
institution Directory of Open Access Journals
issn 2001-0370
language English
last_indexed 2024-04-11T05:18:52Z
publishDate 2022-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj.art-ce3de8e08fbc425c9e2418937d256f61; 2022-12-24T04:55:16Z; eng; Elsevier; Computational and Structural Biotechnology Journal; 2001-0370; 2022-01-01; volume 20, pages 6467-6472; An integrated protein structure fitness scoring approach for identifying native-like model structures; Rahul Kaushik (0), Kam Y.J. Zhang (1); (0) Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Yokohama, Kanagawa 230-0045, Japan; (1) Corresponding author; Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Yokohama, Kanagawa 230-0045, Japan
title An integrated protein structure fitness scoring approach for identifying native-like model structures
topic Protein structure scoring function
Structure quality assessment
Protein structure modelling
Machine learning
Computational protein folding
Computational protein design
url http://www.sciencedirect.com/science/article/pii/S2001037022005268