Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations
Genes are composed of DNA, and each gene has a specific sequence. Recombination or replication within a gene can result in a permanent change in the nucleotide sequence of the DNA, called a mutation, and some mutations can lead to cancer. Breast adenocarcinoma starts in secretory cells. Breast adenocarc...
Main Authors: | Asghar Ali Shah, Fahad Alturise, Tamim Alkhalifah, Yaser Daanial Khan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2022-09-01 |
Series: | International Journal of Molecular Sciences |
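Subjects: | breast adenocarcinoma; long short-term memory (LSTM) network; gated recurrent units (GRU); bi-directional LSTM; mutation detection |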
Online Access: | https://www.mdpi.com/1422-0067/23/19/11539 |
_version_ | 1797478958902542336 |
---|---|
author | Asghar Ali Shah; Fahad Alturise; Tamim Alkhalifah; Yaser Daanial Khan |
author_facet | Asghar Ali Shah; Fahad Alturise; Tamim Alkhalifah; Yaser Daanial Khan |
author_sort | Asghar Ali Shah |
collection | DOAJ |
description | Genes are composed of DNA, and each gene has a specific sequence. Recombination or replication within a gene can result in a permanent change in the nucleotide sequence of the DNA, called a mutation, and some mutations can lead to cancer. Breast adenocarcinoma starts in secretory cells and is the most common of all cancers that occur in women. According to a survey in the United States of America, more than 282,000 breast adenocarcinoma patients are registered each year, most of them women. Recognition of cancer in its early stages saves many lives. A framework is proposed for the early detection of breast adenocarcinoma using an ensemble learning technique with multiple deep learning algorithms, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and bi-directional LSTM. There are 99 driver genes involved in breast adenocarcinoma. This study uses a dataset of 4127 samples, including men and women, taken from more than 12 cohorts of cancer detection institutes. The dataset encompasses a total of 6170 mutations that occur in these 99 genes. Different algorithms are applied to the gene sequences for feature extraction, after which the LSTM, GRU, and bi-directional LSTM models are applied. Three testing techniques (independent set testing, self-consistency testing, and 10-fold cross-validation) are used to validate and test the learning approaches. The evaluation metrics reported are accuracy, sensitivity, specificity, Matthews correlation coefficient, area under the curve, training loss, precision, recall, F1 score, and Cohen’s kappa, with values of 99.57, 99.50, 99.63, 0.99, 1.0, 0.2027, 99.57, 99.57, 99.57, and 99.14, respectively. |
first_indexed | 2024-03-09T21:39:02Z |
format | Article |
id | doaj.art-484135aeddaf4aa1aa72610be8f169e7 |
institution | Directory Open Access Journal |
issn | 1661-6596 1422-0067 |
language | English |
last_indexed | 2024-03-09T21:39:02Z |
publishDate | 2022-09-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-484135aeddaf4aa1aa72610be8f169e7; 2023-11-23T20:34:59Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1661-6596, 1422-0067; 2022-09-01; vol. 23, no. 19, art. 11539; doi:10.3390/ijms231911539; Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations; Asghar Ali Shah (Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan); Fahad Alturise (Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass 58892, Qassim, Saudi Arabia); Tamim Alkhalifah (Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass 58892, Qassim, Saudi Arabia); Yaser Daanial Khan (Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan); https://www.mdpi.com/1422-0067/23/19/11539; keywords: breast adenocarcinoma; long short-term memory (LSTM) network; gated recurrent units (GRU); bi-directional LSTM; mutation detection |
title | Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations |
topic | breast adenocarcinoma; long short-term memory (LSTM) network; gated recurrent units (GRU); bi-directional LSTM; mutation detection |
url | https://www.mdpi.com/1422-0067/23/19/11539 |
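
The description lists several threshold-based evaluation metrics (accuracy, sensitivity, specificity, Matthews correlation coefficient, precision, recall, F1 score, and Cohen’s kappa). As a reading aid, the following is a minimal sketch of how these quantities are conventionally derived from a binary confusion matrix. It is not the authors' code: the function name `binary_metrics` and the example counts are hypothetical, and the results are returned as fractions, whereas the record reports most values on a 0 to 100 scale. Area under the curve and training loss need per-sample scores and are not covered here.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs in this sketch).
    """
    total = tp + fp + tn + fn
    accuracy = _safe_div(tp + tn, total)
    sensitivity = _safe_div(tp, tp + fn)   # also called recall
    specificity = _safe_div(tn, tn + fp)
    precision = _safe_div(tp, tp + fp)
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)

    # Matthews correlation coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_yes = _safe_div(tp + fp, total) * _safe_div(tp + fn, total)
    p_no = _safe_div(tn + fn, total) * _safe_div(tn + fp, total)
    p_chance = p_yes + p_no
    kappa = _safe_div(accuracy - p_chance, 1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the study.
    for name, value in binary_metrics(tp=90, fp=10, tn=85, fn=15).items():
        print(f"{name}: {value:.4f}")
```

Sensitivity and recall are the same quantity, which is consistent with the record listing both among its metrics.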