Prediction of protein structural classes for low-homology sequences based on predicted secondary structure

Abstract Background Prediction of protein structural classes (α, β, α + β and α/β) from amino acid sequences is of great import...

Bibliographic Details
Main Authors:	Chen Xin, Peng Zhen-Ling, Yang Jian-Yi
Format:	Article
Language:	English
Published:	BMC 2010-01-01
Series:	BMC Bioinformatics

_version_	1811266135146889216
author	Chen Xin, Peng Zhen-Ling, Yang Jian-Yi
author_facet	Chen Xin, Peng Zhen-Ling, Yang Jian-Yi
author_sort	Chen Xin
collection	DOAJ
description	Abstract
Background: Prediction of protein structural classes (α, β, α + β and α/β) from amino acid sequences is of great importance because it aids the study of protein function, regulation and interactions. Many methods have been developed for high-homology protein sequences, and their prediction accuracies can reach up to 90%. However, for low-homology sequences whose average pairwise sequence identity lies between 20% and 40%, these methods perform relatively poorly, with prediction accuracy often below 60%.
Results: We propose a new method to predict protein structural classes on the basis of features extracted from the predicted secondary structures of proteins rather than directly from their amino acid sequences. It first uses PSIPRED to predict the secondary structure of each protein sequence. Then, the chaos game representation is employed to represent the predicted secondary structure as two time series, from which we generate a comprehensive set of 24 features using recurrence quantification analysis, K-string based information entropy and segment-based analysis. The resulting feature vectors are finally fed into a simple yet powerful Fisher's discriminant algorithm for the prediction of protein structural classes. We tested the proposed method on three low-homology benchmark datasets and achieved overall prediction accuracies of 82.9%, 83.1% and 81.3%, respectively. Comparisons with ten existing methods showed that our method consistently performs better on all the tested datasets, with overall accuracy improvements ranging from 2.3% to 27.5%. A web server that implements the proposed method is freely available at http://www1.spms.ntu.edu.sg/~chenxin/RKS_PPSC/.
Conclusion: The high prediction accuracy achieved by our proposed method is attributed to the design of a comprehensive feature set on the predicted secondary structure sequences, which is capable of characterizing the sequence order information, local interactions of the secondary structural elements, and spatial arrangements of α helices and β strands. Thus, it is a valuable method for predicting protein structural classes, particularly for low-homology amino acid sequences.
first_indexed	2024-04-12T20:37:39Z
format	Article
id	doaj.art-e36a5884ad66412aaa46aca0d0fc99e5
institution	Directory of Open Access Journals
issn	1471-2105
language	English
last_indexed	2024-04-12T20:37:39Z
publishDate	2010-01-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-e36a5884ad66412aaa46aca0d0fc99e5 | 2022-12-22T03:17:33Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2010-01-01 | 11 | Suppl 1 | S9 | 10.1186/1471-2105-11-S1-S9 | Prediction of protein structural classes for low-homology sequences based on predicted secondary structure | Chen Xin, Peng Zhen-Ling, Yang Jian-Yi
spellingShingle	Chen Xin Peng Zhen-Ling Yang Jian-Yi Prediction of protein structural classes for low-homology sequences based on predicted secondary structure BMC Bioinformatics
title	Prediction of protein structural classes for low-homology sequences based on predicted secondary structure
title_full	Prediction of protein structural classes for low-homology sequences based on predicted secondary structure
title_fullStr	Prediction of protein structural classes for low-homology sequences based on predicted secondary structure
title_full_unstemmed	Prediction of protein structural classes for low-homology sequences based on predicted secondary structure
title_short	Prediction of protein structural classes for low-homology sequences based on predicted secondary structure
title_sort	prediction of protein structural classes for low homology sequences based on predicted secondary structure
work_keys_str_mv	AT chenxin predictionofproteinstructuralclassesforlowhomologysequencesbasedonpredictedsecondarystructure AT pengzhenling predictionofproteinstructuralclassesforlowhomologysequencesbasedonpredictedsecondarystructure AT yangjianyi predictionofproteinstructuralclassesforlowhomologysequencesbasedonpredictedsecondarystructure
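
The abstract in this record describes mapping a predicted secondary structure string to two time series with the chaos game representation and computing a K-string based information entropy. The following is a minimal Python sketch of those two steps only, not the authors' implementation. The triangle vertex coordinates, the starting point, the three-letter alphabet H/E/C, and the function names `cgr_time_series` and `k_string_entropy` are all assumptions made for illustration; the record does not specify them, and the entropy here is the plain Shannon entropy of K-string frequencies, which is one plausible reading of the term.

```python
import math
from collections import Counter

# Assumed layout (not given in the record): one vertex of an equilateral
# triangle per secondary structure state (H = helix, E = strand, C = coil).
VERTICES = {
    "H": (0.0, 0.0),
    "E": (1.0, 0.0),
    "C": (0.5, math.sqrt(3) / 2),
}

# Assumed starting point: the centroid of the triangle.
CENTROID = (0.5, math.sqrt(3) / 6)


def cgr_time_series(ss, start=CENTROID):
    """Return the x and y coordinate series of the chaos game walk.

    Each step moves halfway from the current point toward the vertex
    of the current secondary structure symbol.
    """
    x, y = start
    xs, ys = [], []
    for symbol in ss:
        vx, vy = VERTICES[symbol]
        x, y = (x + vx) / 2, (y + vy) / 2
        xs.append(x)
        ys.append(y)
    return xs, ys


def k_string_entropy(ss, k):
    """Shannon entropy (in bits) of the overlapping K-string frequencies."""
    if k < 1 or len(ss) < k:
        raise ValueError("k must satisfy 1 <= k <= len(ss)")
    counts = Counter(ss[i:i + k] for i in range(len(ss) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    predicted = "CCHHHHHHCCEEEEECC"  # hypothetical PSIPRED-style string
    xs, ys = cgr_time_series(predicted)
    print(len(xs), len(ys))
    print(k_string_entropy(predicted, 2))
```

In the pipeline the abstract outlines, the two series would then feed recurrence quantification analysis, and the resulting features would go to a Fisher's discriminant classifier; neither step is sketched here.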

Similar Items