Random forest-based protein model quality assessment (RFMQA) using structural features and potential energy terms.

Prediction of a protein's three-dimensional (3D) structure from its sequence has made significant progress in recent years, owing to advances in computational techniques and the growth in the number of experimental structures. However, selecting good models from a pool of structural models remains an important and chal...

Bibliographic Details
Main Authors: Balachandran Manavalan, Juyong Lee, Jooyoung Lee
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4164442?pdf=render
author Balachandran Manavalan
Juyong Lee
Jooyoung Lee
collection DOAJ
description Prediction of a protein's three-dimensional (3D) structure from its sequence has made significant progress in recent years, owing to advances in computational techniques and the growth in the number of experimental structures. However, selecting good models from a pool of structural models remains an important and challenging task in protein structure prediction. In this study, we present the first application of random forest-based model quality assessment (RFMQA) to rank protein models using their structural features and knowledge-based potential energy terms. The method predicts a relative score for a model from its secondary structure, solvent accessibility and knowledge-based potential energy terms. We trained and tested RFMQA on CASP8 and CASP9 targets using 5-fold cross-validation. The correlation coefficient between the TM-score of the model selected by RFMQA (TMRF) and that of the best server model (TMbest) is 0.945. We benchmarked the method on recent CASP10 targets, using CASP8 and CASP9 server models as the training set. The correlation coefficient and average difference between TMRF and TMbest over 95 CASP10 targets are 0.984 and 0.0385, respectively. The test results show that our method selects top models better than other top-performing methods. RFMQA is available for download from http://lee.kias.re.kr/RFMQA/RFMQA_eval.tar.gz.
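The evaluation summarized in the description (the correlation and average difference between TMRF and TMbest across targets) can be illustrated with a minimal sketch. This is not the RFMQA code: the function names, the data layout and the toy numbers are hypothetical, and the predicted scores stand in for the output of any trained quality-assessment model.

```python
from math import sqrt

def select_top_model(predicted, true_tm):
    """Return (TM-score of the model ranked first by the predictor,
    best TM-score available in the pool) for one target."""
    top = max(predicted, key=predicted.get)
    return true_tm[top], max(true_tm.values())

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def evaluate(targets):
    """targets maps a target id to (predicted scores, true TM-scores),
    each a dict keyed by model name. Returns (correlation, average difference)."""
    pairs = [select_top_model(pred, tm) for pred, tm in targets.values()]
    tm_selected = [sel for sel, _ in pairs]
    tm_best = [best for _, best in pairs]
    corr = pearson(tm_selected, tm_best)
    avg_diff = sum(best - sel for sel, best in pairs) / len(pairs)
    return corr, avg_diff

# Hypothetical toy data (assumption, not CASP results): three targets, two models each.
TOY_TARGETS = {
    "T1": ({"m1": 0.9, "m2": 0.4}, {"m1": 0.80, "m2": 0.50}),
    "T2": ({"m1": 0.3, "m2": 0.7}, {"m1": 0.60, "m2": 0.55}),
    "T3": ({"m1": 0.2, "m2": 0.6}, {"m1": 0.30, "m2": 0.35}),
}

corr, avg_diff = evaluate(TOY_TARGETS)
print(f"correlation={corr:.3f}, average difference={avg_diff:.4f}")
```

A correlation near 1 together with a small average difference indicates that the top-ranked model is usually close in TM-score to the best model in the pool, which is how the figures quoted in the description can be read.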
first_indexed 2024-12-22T07:34:50Z
format Article
id doaj.art-743e6e5194dc4dd2a8098f99a3571b73
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-22T07:34:50Z
publishDate 2014-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling PLoS ONE 9(9): e106542 (2014-01-01). doi:10.1371/journal.pone.0106542
title Random forest-based protein model quality assessment (RFMQA) using structural features and potential energy terms.
url http://europepmc.org/articles/PMC4164442?pdf=render