Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method

Determining environmental chemical carcinogenicity is urgently needed as humans are increasingly exposed to these chemicals. In this study, we developed a hybrid neural network (HNN) method called HNN-Cancer to predict potential carcinogens of real-life chemicals. The HNN-Cancer included a new SMILE...

Bibliographic Details
Main Authors: Sarita Limbu, Sivanesan Dakshanamurthy
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Sensors
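Subjects: chemical carcinogens; machine learning; deep learning neural network; hybrid neural network; convolution neural network; fast forward neural network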
Online Access: https://www.mdpi.com/1424-8220/22/21/8185
author Sarita Limbu
Sivanesan Dakshanamurthy
collection DOAJ
description Determining the carcinogenicity of environmental chemicals is urgently needed as humans are increasingly exposed to these chemicals. In this study, we developed a hybrid neural network (HNN) method called HNN-Cancer to predict potential carcinogens among real-life chemicals. HNN-Cancer includes a new SMILES feature representation method, obtained by modifying our previous 3D array representation of 1D SMILES simulated by the convolutional neural network (CNN). We developed binary classification, multiclass classification, and regression models based on diverse non-congeneric chemicals. Along with the HNN-Cancer model, we developed models based on the random forest (RF), bootstrap aggregating (Bagging), and adaptive boosting (AdaBoost) methods for binary and multiclass classification. We developed regression models using HNN-Cancer, RF, support vector regressor (SVR), gradient boosting (GB), kernel ridge (KR), decision tree with AdaBoost (DT), KNeighbors (KN), and a consensus method. The performance of the models for all classification tasks was assessed using various statistical metrics. For the binary classification models, developed with 7994 chemicals, the accuracy of the HNN-Cancer, RF, and Bagging models was 74% and their AUC was ~0.81. HNN-Cancer achieved a sensitivity of 79.5% and a specificity of 67.3%, outperforming the other methods. For the multiclass classification models, developed with 1618 chemicals, the best accuracy was 70%, with an AUC of 0.7, for HNN-Cancer, RF, Bagging, and AdaBoost. For the regression models, the correlation coefficient (R) was around 0.62 for HNN-Cancer and RF, higher than for the SVM, GB, KR, DTBoost, and NN machine learning methods. Overall, HNN-Cancer performed better on the majority of the known carcinogen experimental datasets. Further, the predictive performance of HNN-Cancer on diverse chemicals is comparable to that of literature-reported models that included similar and less diverse molecules. HNN-Cancer could be used to identify potentially carcinogenic chemicals across a wide variety of chemical classes.
first_indexed 2024-03-09T18:41:12Z
format Article
id doaj.art-900b68e4b8b04539b579b27d22fb089e
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T18:41:12Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-900b68e4b8b04539b579b27d22fb089e
2023-11-24T06:44:01Z
eng
MDPI AG
Sensors
1424-8220
2022-10-01
22(21), 8185
10.3390/s22218185
Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method
Sarita Limbu (Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC 20057, USA)
Sivanesan Dakshanamurthy (Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC 20057, USA)
https://www.mdpi.com/1424-8220/22/21/8185
title Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method
topic chemical carcinogens
machine learning
deep learning neural network
hybrid neural network
convolution neural network
fast forward neural network
url https://www.mdpi.com/1424-8220/22/21/8185