A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection
Abstract: Early-stage cancer prediction is a topic of major interest in medicine because it allows accurate and efficient action toward successful treatment. Most cancer datasets contain many gene expression levels as features but few samples, so firstly there is a need t...
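Main Authors: | Rajul Mahto, Saboor Uddin Ahmed, Rizwan ur Rahman, Rabia Musheer Aziz, Priyanka Roy, Saurav Mallik, Aimin Li, Mohd Asif Shah |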
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2023-12-01 |
Series: | BMC Bioinformatics |
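Subjects: | Deep learning (DL); Cuckoo search algorithm (CSA); Spider monkey optimization (SM); Minimum redundancy maximum relevance (mRMR); Cancer classification |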
Online Access: | https://doi.org/10.1186/s12859-023-05605-5 |
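
The abstract of this record describes a pipeline in which minimum redundancy maximum relevance (mRMR) first filters the genes, CSSMO then searches for a gene subset, and a deep learning model classifies the result. As a rough illustration of the first stage only, the sketch below implements a greedy mRMR filter in plain Python. It is not the authors' implementation: the record does not say which mRMR variant or discretization the paper uses, so the mutual-information-difference criterion, the equal-frequency binning, and the names `mutual_information`, `discretize`, and `mrmr_select` are all assumptions made for this example.

```python
import math
from bisect import bisect_right
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over the observed pairs.
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def discretize(column, n_bins=3):
    """Map a numeric column to bin indices using equal-frequency cut points.

    Assumption: expression values are binned before estimating mutual
    information; equal values always land in the same bin.
    """
    ordered = sorted(column)
    n = len(ordered)
    edges = [ordered[n * q // n_bins] for q in range(1, n_bins)]
    return [bisect_right(edges, value) for value in column]


def mrmr_select(X, y, k, n_bins=3):
    """Greedily pick up to k column indices of X (a list of sample rows).

    Each step adds the feature maximizing
    relevance(feature, y) - mean redundancy(feature, already selected).
    """
    columns = [discretize(col, n_bins) for col in zip(*X)]
    relevance = [mutual_information(col, y) for col in columns]
    # Start from the single most relevant feature (first one on ties).
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    redundancy = [0.0] * len(columns)  # running sum of MI with selected features
    while len(selected) < min(k, len(columns)):
        last = columns[selected[-1]]
        best, best_score = None, -math.inf
        for j, col in enumerate(columns):
            if j in selected:
                continue
            redundancy[j] += mutual_information(col, last)
            score = relevance[j] - redundancy[j] / len(selected)
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Calling `mrmr_select(X, y, k)` with `X` as a list of sample rows and `y` as class labels returns the chosen column indices in selection order. In the pipeline the abstract describes, a reduced gene list of this kind would then be passed to the CSSMO search and the deep learning classifier, neither of which is sketched here.
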
_version_ | 1797388039868121088 |
---|---|
author | Rajul Mahto; Saboor Uddin Ahmed; Rizwan ur Rahman; Rabia Musheer Aziz; Priyanka Roy; Saurav Mallik; Aimin Li; Mohd Asif Shah |
author_sort | Rajul Mahto |
collection | DOAJ |
description | Abstract: Early-stage cancer prediction is a topic of major interest in medicine because it allows accurate and efficient action toward successful treatment. Most cancer datasets contain many gene expression levels as features but few samples, so similar features must first be eliminated to permit faster convergence of classification algorithms. These features (genes) make it possible to identify the disease, choose the best prescription to prevent cancer, and discover deviations amid different techniques. To address this problem, we propose a novel hybrid technique, CSSMO-based gene selection, for cancer classification. First, we alter the fitness of spider monkey optimization (SMO) with the cuckoo search algorithm (CSA), yielding CSSMO for feature selection, which combines the benefits of both metaheuristic algorithms to discover a subset of genes that helps predict cancer at an early stage. Further, to enhance the accuracy of the CSSMO algorithm, we apply a cleaning process, minimum redundancy maximum relevance (mRMR), to reduce the number of genes in the cancer datasets. Next, these gene subsets are classified using deep learning (DL) to identify the different groups or classes related to a particular cancer. Eight benchmark microarray gene expression datasets of cancer are used to analyze the performance of the proposed approach with evaluation metrics such as recall, precision, F1-score, and the confusion matrix. The proposed gene selection method with DL achieves much better classification accuracy than other existing DL and machine learning classification models on all the large cancer gene expression datasets. |
first_indexed | 2024-03-08T22:33:58Z |
format | Article |
id | doaj.art-06c4f17adba2461a869c22ab98dcbf22 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-03-08T22:33:58Z |
publishDate | 2023-12-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-06c4f17adba2461a869c22ab98dcbf22; 2023-12-17T12:31:48Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2023-12-01; vol. 24, no. 1, pp. 1-26; 10.1186/s12859-023-05605-5; A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection; Rajul Mahto (School of Computing Science and Engineering, VIT Bhopal University); Saboor Uddin Ahmed (School of Computing Science and Engineering, VIT Bhopal University); Rizwan ur Rahman (School of Computing Science and Engineering, VIT Bhopal University); Rabia Musheer Aziz (School of Advanced Sciences and Language, VIT Bhopal University); Priyanka Roy (School of Advanced Sciences and Language, VIT Bhopal University); Saurav Mallik (Molecular and Integrative Physiological Sciences, Department of Environmental Health, Harvard T. H. Chan School of Public Health); Aimin Li (Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston); Mohd Asif Shah (Department of Economics, Kebri Dehar University) |
title | A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection |
topic | Deep learning (DL); Cuckoo search algorithm (CSA); Spider monkey optimization (SM); Minimum redundancy maximum relevance (mRMR); Cancer classification |
url | https://doi.org/10.1186/s12859-023-05605-5 |