An enhanced and efficient approach for feature selection for chronic human disease prediction: A breast cancer study

Bibliographic Details
Main Authors: Munish Khanna, Law Kumar Singh, Kapil Shrivastava, Rekha Singh
Format: Article
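Language: English
Published: Elsevier 2024-03-01
Series: Heliyon
Subjects: Feature selection; Breast cancer prediction; Machine learning; Soft-computing; Teaching learning based optimization; Elephant herding optimization
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844024028305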
_version_ 1797259835572486144
author Munish Khanna
Law Kumar Singh
Kapil Shrivastava
Rekha Singh
collection DOAJ
description Computer-aided diagnosis (CAD) systems play a vital role in modern research by reducing both time and cost. They support healthcare professionals such as radiologists in their decision-making by efficiently detecting abnormalities and providing accurate, dependable information. These systems depend heavily on efficient feature selection to categorize high-dimensional biological data accurately, and the selected features can subsequently assist in diagnosing related medical conditions. Identifying patterns in biomedical data is challenging because of the many irrelevant or redundant features present. It is therefore crucial to apply a feature selection (FS) process that eliminates these features. The primary goal of FS approaches is to improve classification accuracy by eliminating features that are irrelevant or less informative. The FS phase plays a critical role in attaining optimal results in machine learning (ML)-driven CAD systems. The effectiveness of ML models can be significantly enhanced by incorporating well-chosen features during the training phase. This empirical study presents a methodology for classifying biomedical data using FS. The proposed approach incorporates three soft computing-based optimization algorithms: Teaching Learning-Based Optimization (TLBO), Elephant Herding Optimization (EHO), and a proposed hybrid of the two. These algorithms have been employed previously; however, their effectiveness in addressing FS problems in human disease prediction has not been investigated. The evaluation focuses on categorizing benign and malignant tumours using the publicly available Wisconsin Diagnostic Breast Cancer (WDBC) benchmark dataset. Five-fold cross-validation is employed to mitigate the risk of over-fitting. The proposed approach is evaluated on several metrics: sensitivity, specificity, precision, accuracy, area under the receiver-operating characteristic curve (AUC), and F1-score. The best accuracy obtained with the suggested approach is 97.96%. The proposed clinical decision support system demonstrates highly favourable classification performance, making it a valuable tool that medical practitioners can use as a second opinion and that reduces the burden on expert medical practitioners.
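The threshold-based metrics named in the description can be illustrated with a minimal sketch. It assumes binary labels in which 1 stands for malignant and 0 for benign; the names `ConfusionCounts`, `confusion_counts`, and `binary_metrics` are hypothetical helpers introduced here and are not taken from the article. AUC is left out because it needs ranked prediction scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fn, fp, tn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c):
    """Compute sensitivity, specificity, precision, accuracy, and F1-score."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # recall on the positive class
        "specificity": _ratio(c.tn, c.tn + c.fp),  # recall on the negative class
        "precision": _ratio(c.tp, c.tp + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Made-up labels for illustration only; these are not results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(counts)
print(counts)
print(metrics)
```

Under five-fold cross-validation, one plausible use is to compute these values on each held-out fold and average them; whether the article averages per fold or pools predictions is not stated in this record.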
first_indexed 2024-03-07T16:53:39Z
format Article
id doaj.art-028c44e80d7d40c3bb053bd2f7bb2bc7
institution Directory Open Access Journal
issn 2405-8440
language English
last_indexed 2024-04-24T23:15:45Z
publishDate 2024-03-01
publisher Elsevier
record_format Article
series Heliyon
spelling doaj.art-028c44e80d7d40c3bb053bd2f7bb2bc7
2024-03-17T07:56:18Z
eng
Elsevier
Heliyon
2405-8440
2024-03-01
Volume 10, Issue 5, e26799
Munish Khanna (School of Computing Science and Engineering, Galgotias University, Greater Noida, Gautam Buddh Nagar, India; corresponding author)
Law Kumar Singh (Department of Computer Engineering and Applications, GLA University, Mathura, India)
Kapil Shrivastava (Department of Computer Engineering and Applications, GLA University, Mathura, India)
Rekha Singh (Department of Physics, Uttar Pradesh Rajarshi Tandon Open University, Prayagraj, Uttar Pradesh, India)
title An enhanced and efficient approach for feature selection for chronic human disease prediction: A breast cancer study
topic Feature selection
Breast cancer prediction
Machine learning
Soft-computing
Teaching learning based optimization
Elephant herding optimization