IMKPse: Identification of Protein Malonylation Sites by the Key Features Into General PseAAC

Lysine malonylation is currently regarded as one of the key protein post-translational modifications in biology, and lysine plays a significant role in the regulation of several biological processes. Therefore, accurately identifying this type of modification will make...

Bibliographic Details
Main Authors: Wenzheng Bao, Bin Yang, De-Shuang Huang, Dong Wang, Qi Liu, Yue-Hui Chen, Rong Bao
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Post translational modification, amino acid residues identification, flexible neural tree
Online Access: https://ieeexplore.ieee.org/document/8660642/
author Wenzheng Bao
Bin Yang
De-Shuang Huang
Dong Wang
Qi Liu
Yue-Hui Chen
Rong Bao
author_sort Wenzheng Bao
collection DOAJ
description Lysine malonylation is currently regarded as one of the key protein post-translational modifications in biology, and lysine plays a significant role in the regulation of several biological processes. Accurately identifying this type of modification will therefore contribute to understanding these biological processes. Experimental approaches to identifying such modification sites are time-consuming and laborious, so computational approaches to identifying these sites are needed. In this paper, we propose the IMKPse model, which uses general PseAAC as the classification features and a flexible neural tree as the classification model. To deal with the overfitting problem, we used independent datasets for each species. More specifically, the algorithm initially takes amino acid properties from the general PseAAC as candidate features. By comparing the candidate features, the method identifies the top five features among them. When evaluated on the testing sets of three species, E. coli, M. musculus, and H. sapiens, IMKPse obtained MCC values of 0.9185, 0.9097, and 0.9525, respectively. On the independent sets, IMKPse obtained MCC values of 0.9149, 0.9060, and 0.9467, respectively. In addition, we formed combinations of the top five features. The results demonstrate that the proposed algorithm outperforms other approaches. A user-friendly web resource for IMKPse is available at http://121.250.173.184.
first_indexed 2024-12-16T17:34:38Z
format Article
id doaj.art-f00bde19fd2c410aaaae2e2bc4010649
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-16T17:34:38Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-f00bde19fd2c410aaaae2e2bc4010649
2022-12-21T22:22:51Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
7
54073
54083
10.1109/ACCESS.2019.2900275
8660642
IMKPse: Identification of Protein Malonylation Sites by the Key Features Into General PseAAC
Wenzheng Bao (0) https://orcid.org/0000-0002-1471-5432
Bin Yang (1)
De-Shuang Huang (2)
Dong Wang (3)
Qi Liu (4)
Yue-Hui Chen (5)
Rong Bao (6)
(0) School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou, China
(1) School of Information Science and Engineering, Zaozhuang University, Zaozhuang, China
(2) Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai, China
(3) School of Information Science, University of Jinan, Jinan, China
(4) Affiliated Hospital, Xuzhou Medical University, Xuzhou, China
(5) School of Information Science, University of Jinan, Jinan, China
(6) School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou, China
https://ieeexplore.ieee.org/document/8660642/
Post translational modification
amino acid residues identification
flexible neural tree
title IMKPse: Identification of Protein Malonylation Sites by the Key Features Into General PseAAC
topic Post translational modification
amino acid residues identification
flexible neural tree
url https://ieeexplore.ieee.org/document/8660642/
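
The description above reports performance as MCC (Matthews correlation coefficient) values. As a minimal sketch of how such a value is computed from a binary confusion matrix, the function below applies the standard MCC formula. The function name and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the paper or its datasets.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    tp/tn/fp/fn are the counts of true positives, true negatives,
    false positives and false negatives.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any row or column sum is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not from the paper).
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
print(f"MCC = {mcc:.4f}")
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and it stays informative when modified and unmodified sites are unevenly represented, which is one reason it is often reported for site-prediction tasks.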