Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction

Bibliographic Details
Main Authors: Chuan-Ming Liu, Van-Dai Ta, Nguyen Quoc Khanh Le, Direselign Addis Tadesse, Chongyang Shi
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Life
Subjects: glutarylation site prediction, deep neural networks, word embedding, LSTM, ELMo, GloVe
Online Access: https://www.mdpi.com/2075-1729/12/8/1213
author Chuan-Ming Liu
Van-Dai Ta
Nguyen Quoc Khanh Le
Direselign Addis Tadesse
Chongyang Shi
collection DOAJ
description In recent years, many studies have found that dysregulation of glutarylation is associated with human diseases such as diabetes, cancer, and glutaric aciduria type I. Identifying and characterizing glutarylation sites is therefore essential for modification-specific proteomics. This study proposes a novel deep neural network framework based on word embedding techniques for glutarylation site prediction. Multiple deep neural network models are implemented and their prediction performance is evaluated. In addition, an extensive experimental comparison of word embedding techniques is conducted to identify the most efficient method for representing protein sequence data. The results suggest that the proposed deep neural networks not only improve protein sequence representation but also predict glutarylation sites effectively, achieving a higher accuracy and confidence rate than previous work. Moreover, embedding techniques proved more productive than pre-trained word embedding techniques for glutarylation sequence representation. The proposed method significantly outperformed the advanced integrated vector support approach on all traditional performance metrics, with accuracy, specificity, sensitivity, and correlation coefficient of 0.79, 0.89, 0.59, and 0.51, respectively. It shows potential for detecting new glutarylation sites and uncovering the relationships between glutarylation and well-known lysine modifications.
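The four metrics reported in the description can be computed from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code. It assumes that "correlation coefficient" refers to the Matthews correlation coefficient (MCC), which the record does not state. The function name `binary_site_metrics` and the confusion counts are hypothetical: the counts were chosen only so that the ratios reproduce the reported values and are not taken from the paper's dataset.

```python
import math


def binary_site_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return accuracy, specificity, sensitivity, and MCC from confusion counts.

    tp/fn count true glutarylation sites predicted as positive/negative;
    tn/fp count non-sites predicted as negative/positive.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion counts must not all be zero")

    accuracy = (tp + tn) / total
    # Sensitivity (recall): fraction of true sites that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Assumption: the "correlation coefficient" is the Matthews correlation
    # coefficient; it is set to 0.0 when its denominator is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "mcc": mcc,
    }


# Hypothetical counts (100 positive sites, 200 negative sites), illustrative only.
example = binary_site_metrics(tp=59, tn=178, fp=22, fn=41)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts the sketch prints 0.79, 0.89, 0.59, and 0.51. A sensitivity well below the specificity, as here, means such a model misses a sizeable share of true sites while rarely flagging non-sites.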
first_indexed 2024-03-09T09:54:46Z
format Article
id doaj.art-dab3cd152b03454493d61dbb824f85a7
institution Directory of Open Access Journals
issn 2075-1729
language English
last_indexed 2024-03-09T09:54:46Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Life
spelling doaj.art-dab3cd152b03454493d61dbb824f85a7
2023-12-01T23:54:34Z
eng
MDPI AG
Life
2075-1729
2022-08-01
Volume 12, Issue 8, Article 1213
DOI 10.3390/life12081213
Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction
Authors (in source order): Chuan-Ming Liu; Van-Dai Ta; Nguyen Quoc Khanh Le; Direselign Addis Tadesse; Chongyang Shi
Affiliations (in source order):
Department of Computer Science and Information Engineering, National Taipei University of Technology (Taipei Tech), Taipei City 106, Taiwan
Samsung Display Vietnam (SDV), Yen Phong Industrial Park, Bac Ninh 16000, Vietnam
Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei City 106, Taiwan
Institute of Technology, Debre Markos University, Debre Markos P.O. Box 269, Ethiopia
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 102488, China
title Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction
topic glutarylation site prediction
deep neural networks
word embedding
LSTM
ELMo
GloVe
url https://www.mdpi.com/2075-1729/12/8/1213