Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction
In recent years, much research has found that dysregulation of glutarylation is associated with many human diseases, such as diabetes, cancer, and glutaric aciduria type I. Therefore, glutarylation identification and characterization are essential tasks for determining modification-specific proteomi...
Main Authors: | Chuan-Ming Liu, Van-Dai Ta, Nguyen Quoc Khanh Le, Direselign Addis Tadesse, Chongyang Shi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Life |
Subjects: | glutarylation site prediction; deep neural networks; word embedding; LSTM; ELMo; GloVe |
Online Access: | https://www.mdpi.com/2075-1729/12/8/1213 |
_version_ | 1797432056424169472 |
---|---|
author | Chuan-Ming Liu; Van-Dai Ta; Nguyen Quoc Khanh Le; Direselign Addis Tadesse; Chongyang Shi |
author_sort | Chuan-Ming Liu |
collection | DOAJ |
description | Recent research has linked dysregulation of glutarylation to many human diseases, such as diabetes, cancer, and glutaric aciduria type I. Identifying and characterizing glutarylation is therefore essential for modification-specific proteomics. This study proposes a novel deep neural network framework based on word embedding techniques for glutarylation site prediction. Multiple deep neural network models are implemented and evaluated on this task. Furthermore, word embedding techniques are compared extensively to find the most efficient method for representing protein sequence data. The results suggest that the proposed deep neural networks not only improve protein sequence representation but also predict glutarylation sites effectively, obtaining higher accuracy and confidence than previous work. Moreover, embedding techniques were proven to be more productive than the pre-trained word embedding techniques for glutarylation sequence representation. The proposed method significantly outperformed the advanced integrated vector support on all traditional performance metrics, with accuracy, specificity, sensitivity, and correlation coefficient of 0.79, 0.89, 0.59, and 0.51, respectively. It shows the potential to detect new glutarylation sites and uncover the relationships between glutarylation and well-known lysine modification. |
first_indexed | 2024-03-09T09:54:46Z |
format | Article |
id | doaj.art-dab3cd152b03454493d61dbb824f85a7 |
institution | Directory Open Access Journal |
issn | 2075-1729 |
language | English |
last_indexed | 2024-03-09T09:54:46Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Life |
spelling | doaj.art-dab3cd152b03454493d61dbb824f85a7; 2023-12-01T23:54:34Z; eng; MDPI AG; Life; ISSN 2075-1729; 2022-08-01; vol. 12, no. 8, art. 1213; DOI 10.3390/life12081213; Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction; Chuan-Ming Liu (Department of Computer Science and Information Engineering, National Taipei University of Technology (Taipei Tech), Taipei City 106, Taiwan); Van-Dai Ta (Samsung Display Vietnam (SDV), Yen Phong Industrial Park, Bac Ninh 16000, Vietnam); Nguyen Quoc Khanh Le (Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei City 106, Taiwan); Direselign Addis Tadesse (Institute of Technology, Debre Markos University, Debre Markos P.O. Box 269, Ethiopia); Chongyang Shi (School of Computer Science and Technology, Beijing Institute of Technology, Beijing 102488, China); https://www.mdpi.com/2075-1729/12/8/1213; glutarylation site prediction; deep neural networks; word embedding; LSTM; ELMo; GloVe |
title | Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction |
topic | glutarylation site prediction; deep neural networks; word embedding; LSTM; ELMo; GloVe |
url | https://www.mdpi.com/2075-1729/12/8/1213 |
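
The description reports accuracy, specificity, sensitivity, and a correlation coefficient for the site predictor. As a reading aid, here is a minimal sketch of how such metrics are commonly derived from a binary confusion matrix. It assumes the "correlation coefficient" is the Matthews correlation coefficient (MCC), which the record does not state explicitly. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the article.

```python
from math import sqrt


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn count true sites predicted as site/non-site;
    tn/fp count non-sites predicted as non-site/site.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): fraction of true sites that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient (assumed meaning of "correlation coefficient").
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the article's data).
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
print(example)
```

A high specificity paired with a lower sensitivity, as in the reported 0.89 and 0.59, means the predictor rejects most non-sites but misses a sizable share of true sites. MCC condenses all four counts into one value and stays informative when the classes are imbalanced.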