An Approach for Cancer-Type Classification Using Feature Selection Techniques with Convolutional Neural Network
Cancer diagnosis and treatment depend on accurate cancer-type prediction. A prediction model can infer significant cancer features (genes). Gene expression is among the most frequently used features in cancer detection. Deep Learning (DL) architectures, which demonstrate cutting-edge performance in...
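Main Authors: | Saleh N. Almuayqil, Murtada K. Elbashir, Mohamed Ezz, Mohanad Mohammed, Ayman Mohamed Mostafa, Meshrif Alruily, Eslam Hamouda |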
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-10-01 |
Series: | Applied Sciences |
Subjects: | cancer prediction, gene expression, deep learning, Pan-Cancer Atlas, convolutional neural networks |
Online Access: | https://www.mdpi.com/2076-3417/13/19/10919 |
_version_ | 1797576214447128576 |
---|---|
author | Saleh N. Almuayqil Murtada K. Elbashir Mohamed Ezz Mohanad Mohammed Ayman Mohamed Mostafa Meshrif Alruily Eslam Hamouda |
author_sort | Saleh N. Almuayqil |
collection | DOAJ |
description | Cancer diagnosis and treatment depend on accurate cancer-type prediction. A prediction model can infer significant cancer features (genes). Gene expression is among the most frequently used features in cancer detection. Deep Learning (DL) architectures, which demonstrate cutting-edge performance in many disciplines, are not well suited to gene expression data, which contain few samples with thousands of features. This study presents an approach that applies three feature selection techniques (Lasso, Random Forest, and Chi-Square) to gene expression data obtained from the Pan-Cancer Atlas through the TCGA Firehose Data, using R statistical software version 4.2.2. We calculated the feature importance for each selection method. We then calculated the mean of the feature importance to determine the threshold for selecting the most relevant features. We constructed five models with a simple convolutional neural network (CNN) architecture, trained them on the selected features, and then selected the winning model. The winning model achieved a precision of 94.11%, a recall of 94.26%, an F1-score of 94.14%, and an accuracy of 96.16% on a test set. |
first_indexed | 2024-03-10T21:49:05Z |
format | Article |
id | doaj.art-e698e3b04bef4463bfe0bcd872543164 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T21:49:05Z |
publishDate | 2023-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-e698e3b04bef4463bfe0bcd872543164; 2023-11-19T14:06:03Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-10-01; vol. 13, no. 19, article 10919; DOI 10.3390/app131910919; Saleh N. Almuayqil, Murtada K. Elbashir, Ayman Mohamed Mostafa (Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia); Mohamed Ezz, Meshrif Alruily, Eslam Hamouda (Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia); Mohanad Mohammed (School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, Private Bag X01, Scottsville 3209, South Africa); https://www.mdpi.com/2076-3417/13/19/10919; keywords: cancer prediction, gene expression, deep learning, Pan-Cancer Atlas, convolutional neural networks |
title | An Approach for Cancer-Type Classification Using Feature Selection Techniques with Convolutional Neural Network |
topic | cancer prediction gene expression deep learning Pan-Cancer Atlas convolutional neural networks |
url | https://www.mdpi.com/2076-3417/13/19/10919 |
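
The thresholding step summarized in the description (score features with each selection method, then use the mean importance as the cut-off) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record says the study used R, whereas this sketch uses Python; the gene names and scores are hypothetical; and it assumes the threshold is applied separately per method, keeping features that score strictly above the mean. How the study combined the three resulting feature sets is not stated in this record, so the sketch does not combine them.

```python
from statistics import mean


def select_above_mean(importances):
    """Return the features whose importance is strictly above the mean score.

    `importances` maps a feature (gene) name to a non-negative score, e.g.
    an absolute Lasso coefficient, a Random Forest importance, or a
    Chi-Square statistic. The strict comparison is an assumption.
    """
    threshold = mean(importances.values())
    return [gene for gene, score in importances.items() if score > threshold]


# Hypothetical importance scores for four placeholder genes per method.
scores_by_method = {
    "lasso": {"GENE_A": 0.0, "GENE_B": 0.8, "GENE_C": 0.1, "GENE_D": 0.5},
    "random_forest": {"GENE_A": 0.40, "GENE_B": 0.35, "GENE_C": 0.05, "GENE_D": 0.20},
    "chi_square": {"GENE_A": 10.0, "GENE_B": 30.0, "GENE_C": 2.0, "GENE_D": 4.0},
}

# Apply the mean-importance threshold independently for each method.
selected = {
    method: select_above_mean(scores)
    for method, scores in scores_by_method.items()
}

for method, genes in selected.items():
    print(method, genes)
```

Because each method's scores live on a different scale (coefficients, impurity-based importances, test statistics), a per-method mean avoids comparing raw values across methods; each selected list could then feed a separate classifier.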