An Approach for Cancer-Type Classification Using Feature Selection Techniques with Convolutional Neural Network

Cancer diagnosis and treatment depend on accurate cancer-type prediction. A prediction model can infer significant cancer features (genes). Gene expression is among the most frequently used features in cancer detection. Deep Learning (DL) architectures, which demonstrate cutting-edge performance in...

Bibliographic Details
Main Authors: Saleh N. Almuayqil, Murtada K. Elbashir, Mohamed Ezz, Mohanad Mohammed, Ayman Mohamed Mostafa, Meshrif Alruily, Eslam Hamouda
Format: Article
Language: English
Published: MDPI AG 2023-10-01
Series: Applied Sciences
Subjects: cancer prediction, gene expression, deep learning, Pan-Cancer Atlas, convolutional neural networks
Online Access: https://www.mdpi.com/2076-3417/13/19/10919
_version_ 1797576214447128576
author Saleh N. Almuayqil
Murtada K. Elbashir
Mohamed Ezz
Mohanad Mohammed
Ayman Mohamed Mostafa
Meshrif Alruily
Eslam Hamouda
collection DOAJ
description Cancer diagnosis and treatment depend on accurate cancer-type prediction. A prediction model can infer significant cancer features (genes). Gene expression is among the most frequently used features in cancer detection. Deep Learning (DL) architectures, which demonstrate cutting-edge performance in many disciplines, are not appropriate for gene expression data because it contains few samples with thousands of features. This study presents an approach that applies three feature selection techniques (Lasso, Random Forest, and Chi-Square) to gene expression data obtained from the Pan-Cancer Atlas through the TCGA Firehose Data using R statistical software version 4.2.2. We calculated the feature importance under each selection method, then used the mean of the feature importance as the threshold for selecting the most relevant features. We constructed five models with a simple convolutional neural network (CNN) architecture, trained them on the selected features, and selected the winning model. The winning model achieved a precision of 94.11%, a recall of 94.26%, an F1-score of 94.14%, and an accuracy of 96.16% on a test set.
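The mean-importance thresholding step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state whether the threshold is computed per method or how the selected sets map onto the five models, so the sketch assumes one threshold per method and keeps features strictly above it. The function name, gene labels, and scores are all hypothetical.

```python
from statistics import mean


def select_by_mean_importance(importance):
    """Keep features whose importance is strictly above the mean importance.

    `importance` maps a feature (gene) name to its importance score
    under one selection method.
    """
    threshold = mean(importance.values())
    return {gene for gene, score in importance.items() if score > threshold}


# Hypothetical importance scores for four placeholder genes.
# Real scores would come from Lasso coefficients, Random Forest
# importances, and Chi-Square statistics computed on the expression data.
scores = {
    "lasso": {"gene_a": 0.8, "gene_b": 0.0, "gene_c": 0.4, "gene_d": 0.0},
    "random_forest": {"gene_a": 0.5, "gene_b": 0.25, "gene_c": 0.125, "gene_d": 0.125},
    "chi_square": {"gene_a": 12.0, "gene_b": 2.0, "gene_c": 9.0, "gene_d": 1.0},
}

# One selected feature set per method (an assumption; see the note above).
selected = {method: select_by_mean_importance(imp) for method, imp in scores.items()}

for method, genes in selected.items():
    print(method, sorted(genes))
```

Using the mean as the cut-off avoids choosing a fixed number of features in advance, at the cost of letting a few very large scores push the threshold up and shrink the selected set.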
first_indexed 2024-03-10T21:49:05Z
format Article
id doaj.art-e698e3b04bef4463bfe0bcd872543164
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T21:49:05Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-e698e3b04bef4463bfe0bcd872543164, 2023-11-19T14:06:03Z, eng, MDPI AG, Applied Sciences, ISSN 2076-3417, 2023-10-01, vol. 13, no. 19, article 10919, DOI 10.3390/app131910919
Saleh N. Almuayqil (0), Murtada K. Elbashir (1), Mohamed Ezz (2), Mohanad Mohammed (3), Ayman Mohamed Mostafa (4), Meshrif Alruily (5), Eslam Hamouda (6)
0 Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
1 Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
2 Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
3 School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, Private Bag X01, Scottsville 3209, South Africa
4 Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
5 Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
6 Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
title An Approach for Cancer-Type Classification Using Feature Selection Techniques with Convolutional Neural Network
topic cancer prediction
gene expression
deep learning
Pan-Cancer Atlas
convolutional neural networks
url https://www.mdpi.com/2076-3417/13/19/10919