Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

Bibliographic Details
Main Authors: Kurgan Lukasz, Zheng Ce
Format: Article
Language: English
Published: BMC 2008-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/9/430
_version_ 1818641553801871360
author Kurgan Lukasz
Zheng Ce
author_facet Kurgan Lukasz
Zheng Ce
author_sort Kurgan Lukasz
collection DOAJ
description <p>Abstract</p> <p>Background</p> <p>The <it>β</it>-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting <it>β</it>-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based <it>β</it>-turn predictor stems from its use of window-based information extracted from four predicted three-state secondary structures, which, together with a selected set of position-specific scoring matrix (PSSM) values, serves as the input to a support vector machine (SVM) predictor.</p> <p>Results</p> <p>We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, the secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; and (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which corroborates recent studies. Asn, Asp, Gly, and Pro indicate potential <it>β</it>-turns, while the remaining four amino acids are useful for predicting non-<it>β</it>-turns. Empirical evaluation using three nonredundant datasets shows favorable Q<sub>total</sub>, Q<sub>predicted</sub>, and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q<sub>total</sub> barrier and achieves Q<sub>total</sub> = 80.9%, MCC = 0.47, and a Q<sub>predicted</sub> that is over 6% higher than that of the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as input to the proposed prediction method. The applied feature set is 86%, 62%, and 37% smaller than those of the second-best and the two third-best (with respect to MCC) competing methods, respectively.</p> <p>Conclusion</p> <p>Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between <it>β</it>-turns and non-<it>β</it>-turns because it produces fewer false-positive predictions. The prediction model and datasets are freely available at <url>http://biomine.ece.ualberta.ca/BTNpred/BTNpred.html</url>.</p>
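The abstract names two ideas that a short example may make more concrete: the secondary structure content in a window around a residue, and the quality measures Q<sub>total</sub>, Q<sub>predicted</sub>, and MCC. The Python sketch below is illustrative only and is not the authors' implementation. The function names, the `half_width` default, and the H/E/C state alphabet are assumptions introduced here. The metric formulas are the conventional per-residue definitions (Q<sub>total</sub> as overall accuracy, Q<sub>predicted</sub> as the share of predicted turns that are correct), which the abstract itself does not spell out.

```python
from collections import Counter
from math import sqrt


def window_content(ss: str, i: int, half_width: int = 3) -> dict:
    """Fraction of H, E and C states in a window centred on residue i.

    The window is clipped at the sequence ends. half_width is an
    illustrative choice, not a value taken from the paper.
    """
    lo = max(0, i - half_width)
    hi = min(len(ss), i + half_width + 1)
    counts = Counter(ss[lo:hi])
    return {state: counts[state] / (hi - lo) for state in "HEC"}


def turn_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Per-residue quality measures for a binary turn / non-turn predictor.

    Assumed conventional definitions:
      q_total     = percentage of all residues predicted correctly
      q_predicted = percentage of predicted turns that are true turns
      mcc         = Matthews correlation coefficient
    """
    total = tp + tn + fp + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "q_total": 100.0 * (tp + tn) / total,
        "q_predicted": 100.0 * tp / (tp + fp) if tp + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical three-state prediction string and confusion counts.
    print(window_content("CCHHHHEEC", i=4, half_width=2))
    print(turn_metrics(tp=50, tn=30, fp=10, fn=10))
```

Under these definitions, reducing false positives raises Q<sub>predicted</sub> and MCC directly, which fits the abstract's remark that the model discriminates better because it produces fewer false-positive predictions.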
first_indexed 2024-12-16T23:29:00Z
format Article
id doaj.art-1ac22e83025a4fb584e997f0e75aaeda
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-16T23:29:00Z
publishDate 2008-10-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-1ac22e83025a4fb584e997f0e75aaeda 2022-12-21T22:11:56Z eng BMC BMC Bioinformatics 1471-2105 2008-10-01 9 1 430 10.1186/1471-2105-9-430 Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments Kurgan Lukasz Zheng Ce http://www.biomedcentral.com/1471-2105/9/430
spellingShingle Kurgan Lukasz
Zheng Ce
Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments
BMC Bioinformatics
title Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments
title_sort prediction of beta turns at over 80 accuracy based on an ensemble of predicted secondary structures and multiple alignments
url http://www.biomedcentral.com/1471-2105/9/430
work_keys_str_mv AT kurganlukasz predictionofbetaturnsatover80accuracybasedonanensembleofpredictedsecondarystructuresandmultiplealignments
AT zhengce predictionofbetaturnsatover80accuracybasedonanensembleofpredictedsecondarystructuresandmultiplealignments