Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

Bibliographic Details
Main Authors: Kurgan Lukasz, Zheng Ce
Format: Article
Language: English
Published: BMC 2008-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/9/430
Description
Summary:<p>Abstract</p> <p>Background</p> <p>The <it>β</it>-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting <it>β</it>-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based <it>β</it>-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which, together with a selected set of position-specific scoring matrix (PSSM) values, serves as the input to a support vector machine (SVM) predictor.</p> <p>Results</p> <p>We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, the secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which corroborates recent studies. Asn, Asp, Gly, and Pro indicate potential <it>β</it>-turns, while the remaining four amino acids are useful for predicting non-<it>β</it>-turns. Empirical evaluation using three nonredundant datasets shows favorable Q<sub>total</sub>, Q<sub>predicted</sub>, and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q<sub>total</sub> barrier and achieves Q<sub>total</sub> = 80.9%, MCC = 0.47, and a Q<sub>predicted</sub> higher by over 6% when compared with the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method.
The resulting feature set is 86%, 62%, and 37% smaller than those of the second-best and the two third-best (with respect to MCC) competing methods, respectively.</p> <p>Conclusion</p> <p>Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model discriminates better between <it>β</it>-turns and non-<it>β</it>-turns because it produces fewer false positive predictions. The prediction model and datasets are freely available at <url>http://biomine.ece.ualberta.ca/BTNpred/BTNpred.html</url>.</p>
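The window-based features and the quality measures named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `window_features` and `turn_metrics`, the window half-width of 2, and the rule for "inside a segment" (both neighbours share the residue's predicted state) are assumptions chosen for illustration. The measures follow their commonly used definitions: Q<sub>total</sub> as overall accuracy, Q<sub>predicted</sub> as the percentage of predicted β-turn residues that are correct, and MCC as the Matthews correlation coefficient.

```python
import math


def window_features(ss, i, half_width=2):
    """Illustrative features for residue i from one predicted
    three-state secondary structure string (H, E, C).

    half_width is an assumed value, not taken from the paper.
    """
    lo = max(0, i - half_width)
    hi = min(len(ss), i + half_width + 1)
    window = ss[lo:hi]
    # Fraction of each state within the window around residue i.
    content = {state: window.count(state) / len(window) for state in "HEC"}
    # Assumed rule: the residue is inside a segment when both
    # neighbours carry the same predicted state.
    inside = 0 < i < len(ss) - 1 and ss[i - 1] == ss[i] == ss[i + 1]
    return {"state": ss[i], "content": content, "inside_segment": inside}


def turn_metrics(tp, tn, fp, fn):
    """Return (Q_total, Q_predicted, MCC) from confusion-matrix counts,
    where a positive is a beta-turn residue."""
    q_total = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    q_predicted = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_total, q_predicted, mcc


if __name__ == "__main__":
    # Made-up inputs, for demonstration only.
    print(window_features("HHHCCCEEE", 4))
    print(turn_metrics(tp=30, tn=50, fp=10, fn=10))
```

Because Q<sub>predicted</sub> has the false positive count in its denominator, a model that produces fewer false positives scores higher on it, which is consistent with the conclusion stated in the summary.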
ISSN: 1471-2105