Glycosylation site prediction using ensembles of Support Vector Machine classifiers

Abstract. Background: Glycosylation is one of the most complex post-translational modifications (PTMs) of proteins in eukaryotic cells. Glycosylation plays an important role in biological processes ranging from protein folding and subcellular localization...


Bibliographic Details
Main Authors: Silvescu Adrian, Sinapov Jivko, Caragea Cornelia, Dobbs Drena, Honavar Vasant
Format: Article
Language: English
Published: BMC 2007-11-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/438
collection DOAJ
description Abstract
Background: Glycosylation is one of the most complex post-translational modifications (PTMs) of proteins in eukaryotic cells. Glycosylation plays an important role in biological processes ranging from protein folding and subcellular localization, to ligand recognition and cell-cell interactions. Experimental identification of glycosylation sites is expensive and laborious. Hence, there is significant interest in the development of computational methods for reliable prediction of glycosylation sites from amino acid sequences.
Results: We explore machine learning methods for training classifiers to predict the amino acid residues that are likely to be glycosylated using information derived from the target amino acid residue and its sequence neighbors. We compare the performance of Support Vector Machine classifiers and ensembles of Support Vector Machine classifiers trained on a dataset of experimentally determined N-linked, O-linked, and C-linked glycosylation sites extracted from O-GlycBase version 6.00, a database of 242 proteins from several different species. The results of our experiments show that the ensembles of Support Vector Machine classifiers outperform single Support Vector Machine classifiers on the problem of predicting glycosylation sites in terms of a range of standard measures for comparing the performance of classifiers. The resulting methods have been implemented in EnsembleGly, a web server for glycosylation site prediction.
Conclusion: Ensembles of Support Vector Machine classifiers offer an accurate and reliable approach to automated identification of putative glycosylation sites in glycoprotein sequences.
first_indexed 2024-12-14T01:47:00Z
format Article
id doaj.art-4d291d62d668465fbc2f073b6b4d883b
institution Directory of Open Access Journals
issn 1471-2105
language English
last_indexed 2024-12-14T01:47:00Z
publishDate 2007-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-4d291d62d668465fbc2f073b6b4d883b | 2022-12-21T23:21:31Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2007-11-01 | 8 | 1 | 438 | 10.1186/1471-2105-8-438 | Glycosylation site prediction using ensembles of Support Vector Machine classifiers | Silvescu Adrian, Sinapov Jivko, Caragea Cornelia, Dobbs Drena, Honavar Vasant | http://www.biomedcentral.com/1471-2105/8/438