A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection

Abstract: Predicting cancer at an early stage is of major interest in medicine because it enables accurate and efficient action toward successful treatment. Most cancer datasets contain many gene expression levels as features but few samples, so there is first a need t...

Bibliographic Details
Main Authors: Rajul Mahto, Saboor Uddin Ahmed, Rizwan ur Rahman, Rabia Musheer Aziz, Priyanka Roy, Saurav Mallik, Aimin Li, Mohd Asif Shah
Format: Article
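Language: English
Published: BMC 2023-12-01
Series: BMC Bioinformatics
Subjects: Deep learning (DL); Cuckoo search algorithm (CSA); Spider monkey optimization (SM); Minimum redundancy maximum relevance (mRMR); Cancer classification
Online Access: https://doi.org/10.1186/s12859-023-05605-5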
author Rajul Mahto
Saboor Uddin Ahmed
Rizwan ur Rahman
Rabia Musheer Aziz
Priyanka Roy
Saurav Mallik
Aimin Li
Mohd Asif Shah
collection DOAJ
description Abstract: Predicting cancer at an early stage is of major interest in medicine because it enables accurate and efficient action toward successful treatment. Most cancer datasets contain many gene expression levels as features but few samples, so similar features must first be eliminated to let classification algorithms converge faster. These features (genes) make it possible to identify cancer, choose the best prescription to prevent cancer, and discover deviations among different techniques. To address this problem, we propose CSSMO, a novel hybrid technique for gene selection in cancer classification. First, we alter the fitness of spider monkey optimization (SMO) with the cuckoo search algorithm (CSA), yielding CSSMO for feature selection; this combines the benefits of both metaheuristic algorithms to discover a subset of genes that helps predict cancer at an early stage. Further, to enhance the accuracy of the CSSMO algorithm, we apply a cleaning step, minimum redundancy maximum relevance (mRMR), to reduce the number of gene expression features in the cancer datasets. Next, these gene subsets are classified using deep learning (DL) to identify the groups or classes related to a particular cancer. Eight benchmark microarray gene expression datasets of cancer are used to analyze the performance of the proposed approach with evaluation metrics such as recall, precision, F1-score, and the confusion matrix. The proposed gene selection method with DL achieves much better classification accuracy than other existing DL and machine learning classification models on all of the large cancer gene expression datasets.
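The mRMR cleaning step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified greedy variant that uses absolute Pearson correlation as a stand-in for mutual information, and the names `pearson`, `mrmr_select`, `toy_genes`, and `toy_labels` are hypothetical. The CSSMO search and the deep learning classifier are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mrmr_select(columns, labels, k):
    """Greedily pick up to k gene indices by relevance minus mean redundancy.

    columns: one list of expression values per gene (assumed layout).
    labels:  numeric class label per sample.
    Relevance and redundancy use |Pearson r| as a proxy for mutual information.
    """
    relevance = [abs(pearson(col, labels)) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): 4 genes measured on 6 samples, two classes.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_genes = [
    [1, 2, 1, 5, 6, 5],  # gene 0: strongly tracks the class label
    [1, 2, 2, 5, 6, 5],  # gene 1: near-copy of gene 0 (redundant)
    [3, 1, 2, 4, 2, 3],  # gene 2: moderately relevant, weakly tied to gene 0
    [2, 3, 4, 3, 2, 4],  # gene 3: unrelated to the class label
]

picked = mrmr_select(toy_genes, toy_labels, 2)
print(picked)  # [0, 2]
```

On the toy data, gene 1 is nearly as relevant as gene 0 but is skipped in the second round because it is almost perfectly correlated with the gene already chosen, which is the redundancy penalty that mRMR is meant to apply.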
first_indexed 2024-03-08T22:33:58Z
format Article
id doaj.art-06c4f17adba2461a869c22ab98dcbf22
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-03-08T22:33:58Z
publishDate 2023-12-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-06c4f17adba2461a869c22ab98dcbf22
2023-12-17T12:31:48Z
eng
BMC
BMC Bioinformatics
1471-2105
2023-12-01
Volume 24, Issue 1, Pages 1-26
10.1186/s12859-023-05605-5
A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection
Rajul Mahto (0): School of Computing Science and Engineering, VIT Bhopal University
Saboor Uddin Ahmed (1): School of Computing Science and Engineering, VIT Bhopal University
Rizwan ur Rahman (2): School of Computing Science and Engineering, VIT Bhopal University
Rabia Musheer Aziz (3): School of Advanced Sciences and Language, VIT Bhopal University
Priyanka Roy (4): School of Advanced Sciences and Language, VIT Bhopal University
Saurav Mallik (5): Molecular and Integrative Physiological Sciences, Department of Environmental Health, Harvard T. H. Chan School of Public Health
Aimin Li (6): Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston
Mohd Asif Shah (7): Department of Economics, Kebri Dehar University
https://doi.org/10.1186/s12859-023-05605-5
Deep learning (DL)
Cuckoo search algorithm (CSA)
Spider monkey optimization (SM)
Minimum redundancy maximum relevance (mRMR)
Cancer classification
title A novel and innovative cancer classification framework through a consecutive utilization of hybrid feature selection
topic Deep learning (DL)
Cuckoo search algorithm (CSA)
Spider monkey optimization (SM)
Minimum redundancy maximum relevance (mRMR)
Cancer classification
url https://doi.org/10.1186/s12859-023-05605-5