Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations

Bibliographic Details
Main Authors: Asghar Ali Shah, Fahad Alturise, Tamim Alkhalifah, Yaser Daanial Khan
Format: Article
Language: English
Published: MDPI AG 2022-09-01
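Series: International Journal of Molecular Sciences
Subjects: breast adenocarcinoma; long short-term memory (LSTM) network; gated recurrent units (GRU); bi-directional LSTM; mutation detection
Online Access: https://www.mdpi.com/1422-0067/23/19/11539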
author Asghar Ali Shah
Fahad Alturise
Tamim Alkhalifah
Yaser Daanial Khan
collection DOAJ
description Genes are composed of DNA, and each gene has a specific sequence. Recombination or replication within a gene can cause a permanent change in the nucleotide sequence of the DNA, called a mutation, and some mutations can lead to cancer. Breast adenocarcinoma starts in secretory cells and is the most common of all cancers that occur in women. According to a survey in the United States of America, more than 282,000 breast adenocarcinoma patients are registered each year, most of them women. Recognizing cancer in its early stages saves many lives. This study proposes a framework for the early detection of breast adenocarcinoma using an ensemble learning technique with multiple deep learning algorithms, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and bi-directional LSTM. Ninety-nine driver genes are involved in breast adenocarcinoma. The study uses a dataset of 4127 samples, from both men and women, taken from more than 12 cohorts of cancer detection institutes. The dataset encompasses a total of 6170 mutations that occur in these 99 genes. Different algorithms are applied to these gene sequences for feature extraction. Three testing techniques, namely independent set testing, self-consistency testing, and 10-fold cross-validation, are applied to validate and test the learning approaches. Subsequently, multiple deep learning approaches such as LSTM, GRU, and bi-directional LSTM are applied. The results are validated with several evaluation metrics: accuracy (99.57), sensitivity (99.50), specificity (99.63), Matthews correlation coefficient (0.99), area under the curve (1.0), training loss (0.2027), precision (99.57), recall (99.57), F1 score (99.57), and Cohen's kappa (99.14).
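As an illustration of how several of the evaluation metrics named in the description relate to one another, the following minimal Python sketch computes them from the four counts of a binary confusion matrix. It is not the authors' code: the function name `binary_metrics` and the example counts are hypothetical, the values are returned as fractions rather than on the scale used in the abstract, and area under the curve and training loss are left out because they cannot be derived from these counts alone.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics as fractions.

    tp, fp, tn, fn are confusion-matrix counts. This sketch assumes every
    denominator below is non-zero (both classes present and both predicted).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate, identical to recall
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Matthews correlation coefficient, in the range [-1, 1].
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator

    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance_positive = ((tp + fp) / total) * ((tp + fn) / total)
    chance_negative = ((tn + fn) / total) * ((tn + fp) / total)
    chance_agreement = chance_positive + chance_negative
    kappa = (accuracy - chance_agreement) / (1 - chance_agreement)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to exercise the function.
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Multiplying the accuracy, sensitivity, specificity, precision, recall, and F1 fractions by 100 gives percentages comparable in form to the figures quoted in the description.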
first_indexed 2024-03-09T21:39:02Z
format Article
id doaj.art-484135aeddaf4aa1aa72610be8f169e7
institution Directory Open Access Journal
issn 1661-6596
1422-0067
language English
last_indexed 2024-03-09T21:39:02Z
publishDate 2022-09-01
publisher MDPI AG
record_format Article
series International Journal of Molecular Sciences
spelling doaj.art-484135aeddaf4aa1aa72610be8f169e7
2023-11-23T20:34:59Z
eng
MDPI AG
International Journal of Molecular Sciences
ISSN 1661-6596, 1422-0067
2022-09-01
Volume 23, issue 19, article 11539
DOI 10.3390/ijms231911539
Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations
Asghar Ali Shah (0), Fahad Alturise (1), Tamim Alkhalifah (2), Yaser Daanial Khan (3)
(0) Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan
(1) Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass 58892, Qassim, Saudi Arabia
(2) Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass 58892, Qassim, Saudi Arabia
(3) Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan
https://www.mdpi.com/1422-0067/23/19/11539
breast adenocarcinoma
long short-term memory (LSTM) network
gated recurrent units (GRU)
bi-directional LSTM
mutation detection
title Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations
topic breast adenocarcinoma
long short-term memory (LSTM) network
gated recurrent units (GRU)
bi-directional LSTM
mutation detection
url https://www.mdpi.com/1422-0067/23/19/11539