iPhoPred: A Predictor for Identifying Phosphorylation Sites in Human Protein

Protein phosphorylation is an important post-translational modification that regulates many cellular activities in the human body. Accurate identification of phosphorylation sites can provide new insights into the specific functions of proteins. However, it is time-consumin...

Bibliographic Details
Main Authors: Shi-Hao Li, Jun Zhang, Ya-Wei Zhao, Fu-Ying Dao, Hui Ding, Wei Chen, Hua Tang
Format: Article
Language: English
Published: IEEE 2019-01-01
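Series: IEEE Access
Subjects: Phosphorylation site; physicochemical property; analysis of variance; support vector machine; webserver
Online Access: https://ieeexplore.ieee.org/document/8903300/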
_version_ 1818854576513613824
author Shi-Hao Li
Jun Zhang
Ya-Wei Zhao
Fu-Ying Dao
Hui Ding
Wei Chen
Hua Tang
author_sort Shi-Hao Li
collection DOAJ
description Protein phosphorylation is an important post-translational modification that regulates many cellular activities in the human body. Accurate identification of phosphorylation sites can provide new insights into the specific functions of proteins. However, experiment-based techniques for investigating phosphorylation sites in proteins are time-consuming and inefficient, so computational approaches are regarded as an ideal choice in the big data era. In this work, we therefore designed a new computational method to identify phosphorylation sites. First, phosphorylation data were collected from human proteins to construct an objective and strict benchmark dataset. Through a series of feature analyses, we found that the distributions of conservation scores and nine physicochemical properties surrounding the phosphorylation sites in positive samples differ significantly from those surrounding non-phosphorylation sites in negative samples. Based on these features, we proposed a novel sequence-based method for predicting phosphorylation sites in human proteins, which combines the conservation scores with position-associated attributes that reflect the correlation of physicochemical characteristics among amino acid residues. Furthermore, analysis of variance (ANOVA) was used to obtain the optimal feature subset, which produced the highest accuracy. Comparison with the published predictor demonstrated the superiority of our predictor. Finally, a user-friendly online tool called iPhoPred was established and is freely available at http://lin-group.cn/server/iPhoPred/. We hope the tool will provide important guidance for the study of protein phosphorylation.
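The ANOVA-based feature ranking mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `anova_f_score` and `rank_features`, the toy feature matrix, and the labels are all hypothetical, and the record does not say how the ranked features were cut into the final subset. The sketch only shows how a one-way ANOVA F statistic can score each feature by how well it separates positive (phosphorylation) from negative (non-phosphorylation) samples.

```python
from statistics import fmean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = fmean(values)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        # Constant within every class: uninformative if the class means also coincide.
        return 0.0 if ss_between == 0 else float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(X, y):
    """Return (feature indices sorted by descending F score, list of F scores)."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: 6 samples x 3 features; label 1 = positive, 0 = negative.
X = [
    [0.9, 0.2, 5.0],
    [0.8, 0.4, 5.1],
    [0.7, 0.3, 4.9],
    [0.1, 0.3, 5.0],
    [0.2, 0.2, 5.2],
    [0.3, 0.4, 4.8],
]
y = [1, 1, 1, 0, 0, 0]

order, scores = rank_features(X, y)
print(order[0], scores)
```

In this toy data only the first feature differs between the two classes, so it receives by far the largest F score. A common way to use such a ranking, assumed here rather than taken from the record, is to add features in ranked order and keep the subset that gives the best cross-validated accuracy for the classifier.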
first_indexed 2024-12-19T07:54:54Z
format Article
id doaj.art-7acd5a96c6c4445b9626eba84b0a27a6
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T07:54:54Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-7acd5a96c6c4445b9626eba84b0a27a6 2022-12-21T20:30:02Z eng IEEE IEEE Access 2169-3536 2019-01-01 7 177517 177528 10.1109/ACCESS.2019.2953951 8903300
iPhoPred: A Predictor for Identifying Phosphorylation Sites in Human Protein
Shi-Hao Li 0 https://orcid.org/0000-0002-6857-7696
Jun Zhang 1 https://orcid.org/0000-0001-6728-4544
Ya-Wei Zhao 2 https://orcid.org/0000-0003-1827-6577
Fu-Ying Dao 3 https://orcid.org/0000-0001-5285-6044
Hui Ding 4 https://orcid.org/0000-0002-9607-9571
Wei Chen 5
Hua Tang 6
Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
Rehabilitation Department, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
https://ieeexplore.ieee.org/document/8903300/
Phosphorylation site
physicochemical property
analysis of variance
support vector machine
webserver
title iPhoPred: A Predictor for Identifying Phosphorylation Sites in Human Protein
topic Phosphorylation site
physicochemical property
analysis of variance
support vector machine
webserver
url https://ieeexplore.ieee.org/document/8903300/