iPhoPred: A Predictor for Identifying Phosphorylation Sites in Human Protein
Protein phosphorylation is an important type of post-translational modification that regulates various cellular activities in the human body. Accurate identification of phosphorylation sites can provide new insights into the specific functions of proteins. However, it is time-consumin...
Main Authors: | Shi-Hao Li, Jun Zhang, Ya-Wei Zhao, Fu-Ying Dao, Hui Ding, Wei Chen, Hua Tang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
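Subjects: | Phosphorylation site; physicochemical property; analysis of variance; support vector machine; webserver |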
Online Access: | https://ieeexplore.ieee.org/document/8903300/ |
_version_ | 1818854576513613824 |
---|---|
author | Shi-Hao Li, Jun Zhang, Ya-Wei Zhao, Fu-Ying Dao, Hui Ding, Wei Chen, Hua Tang |
collection | DOAJ |
description | Protein phosphorylation is an important type of post-translational modification that regulates various cellular activities in the human body. Accurate identification of phosphorylation sites can provide new insights into the specific functions of proteins. However, experiment-based techniques for investigating phosphorylation sites in proteins are time-consuming and inefficient, and computational approaches are regarded as an ideal choice in the big data era. Therefore, in this work, we designed a new computational method to identify phosphorylation sites. First, phosphorylation data were collected from human proteins to construct an objective and strict benchmark dataset. Through a series of feature analyses, we found that the distributions of conservation scores and nine physicochemical properties surrounding the phosphorylation sites in positive samples are significantly different from those surrounding non-phosphorylation sites in negative samples. Based on these features, we proposed a novel sequence-based method for predicting phosphorylation sites in human proteins, which combines the conservation scores with position-associated attributes that reflect the correlation of physicochemical characteristics among amino acid residues. Furthermore, analysis of variance (ANOVA) was used to obtain the optimal feature subset, that is, the one producing the highest accuracy. Comparison with a published predictor demonstrated the superiority of our predictor. Finally, a user-friendly online tool called iPhoPred was established and is freely available at http://lin-group.cn/server/iPhoPred/. We hope the tool will provide important guidance for the study of protein phosphorylation. |
first_indexed | 2024-12-19T07:54:54Z |
format | Article |
id | doaj.art-7acd5a96c6c4445b9626eba84b0a27a6 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T07:54:54Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-7acd5a96c6c4445b9626eba84b0a27a6; 2022-12-21T20:30:02Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 177517-177528; DOI 10.1109/ACCESS.2019.2953951; IEEE document 8903300; iPhoPred: A Predictor for Identifying Phosphorylation Sites in Human Protein. Authors: Shi-Hao Li (https://orcid.org/0000-0002-6857-7696; Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China); Jun Zhang (https://orcid.org/0000-0001-6728-4544; Rehabilitation Department, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China); Ya-Wei Zhao (https://orcid.org/0000-0003-1827-6577; Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China); Fu-Ying Dao (https://orcid.org/0000-0001-5285-6044; Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China); Hui Ding (https://orcid.org/0000-0002-9607-9571; Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China); Wei Chen (Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China); Hua Tang (Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China). Online access: https://ieeexplore.ieee.org/document/8903300/. Keywords: Phosphorylation site; physicochemical property; analysis of variance; support vector machine; webserver |
title | iPhoPred: A Predictor for Identifying Phosphorylation Sites in Human Protein |
topic | Phosphorylation site; physicochemical property; analysis of variance; support vector machine; webserver |
url | https://ieeexplore.ieee.org/document/8903300/ |
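
The abstract in this record mentions using analysis of variance (ANOVA) to choose a feature subset. As a rough illustration of that general idea only, the sketch below ranks features by a one-way ANOVA F statistic computed between positive and negative samples. It is not the authors' implementation: the function names `anova_f_score` and `rank_features`, the toy data, and the two-class setup are assumptions made for this example, and the paper's actual features (conservation scores, physicochemical properties) and classifier are not reproduced here.

```python
from statistics import mean


def anova_f_score(pos, neg):
    """One-way ANOVA F statistic for a single feature across two classes.

    `pos` and `neg` are lists of that feature's values in the positive and
    negative samples (hypothetical inputs for this sketch).
    """
    groups = [pos, neg]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares (degrees of freedom: k - 1).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (degrees of freedom: N - k).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    if ss_within == 0:
        # No within-class spread: treat any class separation as infinite.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def rank_features(pos_samples, neg_samples):
    """Return feature indices ordered from highest to lowest F score."""
    n_features = len(pos_samples[0])
    scored = []
    for j in range(n_features):
        pos_col = [row[j] for row in pos_samples]
        neg_col = [row[j] for row in neg_samples]
        scored.append((anova_f_score(pos_col, neg_col), j))
    # Sort by descending score; ties fall back to the lower feature index.
    return [j for _, j in sorted(scored, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 barely does.
    positives = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0]]
    negatives = [[3.0, 4.1], [3.2, 5.2], [2.8, 2.9]]
    print(rank_features(positives, negatives))
```

In a workflow of this kind, the ranked list would typically be truncated at different lengths, with each candidate subset evaluated by cross-validation and the best-scoring one kept. That evaluation loop is omitted here because the record does not describe it.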