Bigram-PGK: phosphoglycerylation prediction using the technique of bigram probabilities of position specific scoring matrix

Bibliographic Details
Main Authors: Abel Chandra, Alok Sharma, Abdollah Dehzangi, Daichi Shigemizu, Tatsuhiko Tsunoda
Format: Article
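Language: English
Published: BMC 2019-12-01
Series: BMC Molecular and Cell Biology
Subjects: Post-translational modification; Phosphoglycerylation; Lysine residue; Computational technique; Evolutionary information
Online Access: https://doi.org/10.1186/s12860-019-0240-1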
_version_ 1819275003515895808
author Abel Chandra
Alok Sharma
Abdollah Dehzangi
Daichi Shigemizu
Tatsuhiko Tsunoda
author_sort Abel Chandra
collection DOAJ
description Abstract Background: Post-translational modification (PTM) is a biological process in which the proteome is modified, which affects normal cell biology and hence pathogenesis. A number of PTMs have been discovered in recent years, and lysine phosphoglycerylation is one of the more recent discoveries. Even with a large number of proteins being sequenced in the post-genomic era, identifying phosphoglycerylation remains a major challenge because experimental approaches are costly, time-consuming and inefficient. To overcome this, computational techniques have emerged to identify phosphoglycerylated lysine residues. However, the techniques proposed so far are limited in how accurately they predict this covalent modification. Results: We propose a new predictor, Bigram-PGK, which uses evolutionary information of amino acids to predict phosphoglycerylated sites. A benchmark dataset of experimentally labelled sites is used for this purpose, and profile bigram occurrences are calculated from the position specific scoring matrices of the amino acids in the protein sequences. The sensitivity, specificity, precision, accuracy, Matthews correlation coefficient and area under the ROC curve are reported as 0.9642, 0.8973, 0.8253, 0.9193, 0.8330 and 0.9306, respectively. Conclusions: The proposed predictor, which combines evolutionary-information features with a support vector machine classifier, shows great potential for distinguishing phosphoglycerylated from non-phosphoglycerylated lysine residues when compared against existing predictors. The data and software for this work are available from https://github.com/abelavit/Bigram-PGK.
first_indexed 2024-12-23T23:17:25Z
format Article
id doaj.art-c428b4bc6e0a461098c849082c25c3a7
institution Directory Open Access Journal
issn 2661-8850
language English
last_indexed 2024-12-23T23:17:25Z
publishDate 2019-12-01
publisher BMC
record_format Article
series BMC Molecular and Cell Biology
spelling doaj.art-c428b4bc6e0a461098c849082c25c3a7
2022-12-21T17:26:28Z
eng
BMC
BMC Molecular and Cell Biology
2661-8850
2019-12-01
volume 20, issue S2, pages 1-9
10.1186/s12860-019-0240-1
Bigram-PGK: phosphoglycerylation prediction using the technique of bigram probabilities of position specific scoring matrix
Abel Chandra (School of Engineering and Physics, Faculty of Science Technology and Environment, University of the South Pacific)
Alok Sharma (School of Engineering and Physics, Faculty of Science Technology and Environment, University of the South Pacific)
Abdollah Dehzangi (Department of Computer Science, Morgan State University)
Daichi Shigemizu (Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU))
Tatsuhiko Tsunoda (Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU))
https://doi.org/10.1186/s12860-019-0240-1
Post-translational modification
Phosphoglycerylation
Lysine residue
Computational technique
Evolutionary information
title Bigram-PGK: phosphoglycerylation prediction using the technique of bigram probabilities of position specific scoring matrix
topic Post-translational modification
Phosphoglycerylation
Lysine residue
Computational technique
Evolutionary information
url https://doi.org/10.1186/s12860-019-0240-1
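The description field says that profile bigram occurrences are calculated from position specific scoring matrices (PSSMs). As a rough illustration only, the sketch below uses the commonly cited profile bigram definition, B[m][n] = sum over i of P[i][m] * P[i+1][n], applied to an L x 20 matrix P. This record does not state how the authors normalise the PSSM or which residues around each lysine they include, so the function name `profile_bigram` and the input layout are assumptions, not the authors' implementation (their code is at the GitHub URL given in the description).

```python
from typing import List, Sequence

NUM_AMINO_ACIDS = 20  # one PSSM column per standard amino acid


def profile_bigram(pssm: Sequence[Sequence[float]]) -> List[float]:
    """Return a 400-element bigram feature vector for an (L, 20) matrix.

    Hypothetical helper: entry m * 20 + n accumulates the product of
    column m at position i and column n at position i + 1, summed over
    all consecutive row pairs.
    """
    if any(len(row) != NUM_AMINO_ACIDS for row in pssm):
        raise ValueError("every PSSM row must have 20 columns")

    bigram = [[0.0] * NUM_AMINO_ACIDS for _ in range(NUM_AMINO_ACIDS)]
    # Walk over consecutive row pairs (i, i + 1).
    for current_row, next_row in zip(pssm, pssm[1:]):
        for m, p in enumerate(current_row):
            if p == 0.0:
                continue  # contributes nothing to row m of the bigram matrix
            for n, q in enumerate(next_row):
                bigram[m][n] += p * q

    # Flatten row by row so the feature index is m * 20 + n.
    return [value for row in bigram for value in row]
```

A vector of this kind, computed for the PSSM rows around a candidate lysine, could then be passed to a classifier such as the support vector machine mentioned in the description. How the rows are selected and scaled is left open here because the record does not specify it.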