NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, su...
Main Authors: | Di Liu, Zhengkui Lin, Cangzhi Jia |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2023-07-01 |
Series: | Frontiers in Genetics |
Subjects: | neuropeptides, word2vec, one-hot, stacking strategy, convolution neural network |
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2023.1226905/full |
_version_ | 1797770934999842816 |
---|---|
author | Di Liu Zhengkui Lin Cangzhi Jia |
author_facet | Di Liu Zhengkui Lin Cangzhi Jia |
author_sort | Di Liu |
collection | DOAJ |
description | Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides. |
first_indexed | 2024-03-12T21:30:11Z |
format | Article |
id | doaj.art-02890004027c4ec199b3aa9ed3c8d4ce |
institution | Directory of Open Access Journals |
issn | 1664-8021 |
language | English |
last_indexed | 2024-03-12T21:30:11Z |
publishDate | 2023-07-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-02890004027c4ec199b3aa9ed3c8d4ce; 2023-07-27T21:35:55Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2023-07-01; 14; 10.3389/fgene.2023.1226905; 1226905; NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes; Di Liu (0), Zhengkui Lin (1), Cangzhi Jia (2); (0) Information Science and Technology College, Dalian Maritime University, Dalian, China; (1) Information Science and Technology College, Dalian Maritime University, Dalian, China; (2) School of Science, Dalian Maritime University, Dalian, China; Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides.; https://www.frontiersin.org/articles/10.3389/fgene.2023.1226905/full; neuropeptides; word2vec; one-hot; stacking strategy; convolution neural network |
spellingShingle | Di Liu Zhengkui Lin Cangzhi Jia NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes Frontiers in Genetics neuropeptides word2vec one-hot stacking strategy convolution neural network |
title | NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes |
title_full | NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes |
title_fullStr | NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes |
title_full_unstemmed | NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes |
title_short | NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes |
title_sort | neurocnn gnb an ensemble model to predict neuropeptides based on a convolution neural network and gaussian naive bayes |
topic | neuropeptides word2vec one-hot stacking strategy convolution neural network |
url | https://www.frontiersin.org/articles/10.3389/fgene.2023.1226905/full |
work_keys_str_mv | AT diliu neurocnngnbanensemblemodeltopredictneuropeptidesbasedonaconvolutionneuralnetworkandgaussiannaivebayes AT zhengkuilin neurocnngnbanensemblemodeltopredictneuropeptidesbasedonaconvolutionneuralnetworkandgaussiannaivebayes AT cangzhijia neurocnngnbanensemblemodeltopredictneuropeptidesbasedonaconvolutionneuralnetworkandgaussiannaivebayes |
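
The description above names two of the sequence encodings (one-hot and G-gap dipeptide) and a Gaussian naive Bayes stacking step. The following is a minimal sketch of those ideas, not the authors' implementation. The function and class names, the gap value `g=1`, and the toy probability vectors are assumptions made for illustration. The four CNN base models are not reproduced; their outputs are stood in for by made-up numbers. The small hand-written Gaussian naive Bayes class is a substitute for a library implementation.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """One 20-element 0/1 row per residue; unknown residues give an all-zero row."""
    return [[1 if aa == res else 0 for aa in AMINO_ACIDS] for res in seq]


def g_gap_dipeptide(seq, g=1):
    """Frequencies of residue pairs separated by g positions (400 values).

    The gap g=1 is an illustrative default, not a value taken from the record.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        key = seq[i] + seq[i + g + 1]
        if key in counts:  # skip pairs containing non-standard residues
            counts[key] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in pairs]


class TinyGaussianNB:
    """Hand-written Gaussian naive Bayes used here as a stacking meta-learner."""

    def fit(self, X, y, eps=1e-9):
        self.stats = {}
        n = len(y)
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            cols = list(zip(*rows))
            means = [sum(col) / len(rows) for col in cols]
            # eps keeps the variance strictly positive for constant columns
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + eps
                for col, m in zip(cols, means)
            ]
            self.stats[label] = (math.log(len(rows) / n), means, variances)
        return self

    def predict(self, x):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances)
            )

        return max(self.stats, key=log_posterior)


# Hypothetical meta-features: each row holds the positive-class probability
# that four base models might output for one peptide (numbers are made up).
train_meta = [
    [0.90, 0.80, 0.70, 0.95],
    [0.85, 0.90, 0.60, 0.80],
    [0.20, 0.10, 0.30, 0.15],
    [0.10, 0.25, 0.20, 0.30],
]
train_labels = [1, 1, 0, 0]  # 1 = neuropeptide, 0 = non-neuropeptide

meta_model = TinyGaussianNB().fit(train_meta, train_labels)
prediction = meta_model.predict([0.80, 0.70, 0.90, 0.75])
```

In a stacking setup of this kind, the meta-learner sees only the base models' outputs, so each encoding can be handled by its own model and the combination step stays small.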