A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset
Feature selection is commonly used to identify the top n features that contribute most to a desired prediction, for example, to find the top 50 or 100 genes responsible for lung or kidney cancer out of 50,000 genes. It is therefore a highly time- and resource-consuming task. In this...
Main Authors: | Arihant Tanwar, Wajdi Alghamdi, Mohammad D. Alahmadi, Harpreet Singh, Prashant Singh Rana |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-02-01 |
Series: | Mathematics |
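Subjects: | feature selection; divide-and-conquer technique; huge dimension dataset; genomic dataset; fuzzy technique; fuzzy backward feature elimination |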
Online Access: | https://www.mdpi.com/2227-7390/11/4/920 |
_version_ | 1797619538352668672 |
---|---|
author | Arihant Tanwar; Wajdi Alghamdi; Mohammad D. Alahmadi; Harpreet Singh; Prashant Singh Rana |
author_facet | Arihant Tanwar; Wajdi Alghamdi; Mohammad D. Alahmadi; Harpreet Singh; Prashant Singh Rana |
author_sort | Arihant Tanwar |
collection | DOAJ |
description | Feature selection is commonly used to identify the top n features that contribute most to a desired prediction, for example, to find the top 50 or 100 genes responsible for lung or kidney cancer out of 50,000 genes. It is therefore a highly time- and resource-consuming task. In this work, we propose a divide-and-conquer technique with fuzzy backward feature elimination (FBFE) that helps to find the important features quickly and accurately. To show the robustness of the proposed method, it is applied to eight different datasets taken from the NCBI database. We compare the proposed method with seven state-of-the-art feature selection methods and find that it is faster and achieves better classification accuracy. The proposed method works for qualitative, quantitative, continuous, and discrete datasets. A web service has been developed for researchers and academics to select the top n features. |
first_indexed | 2024-03-11T08:28:26Z |
format | Article |
id | doaj.art-769eb0be0a62418e87f070739573b038 |
institution | Directory of Open Access Journals |
issn | 2227-7390 |
language | English |
last_indexed | 2024-03-11T08:28:26Z |
publishDate | 2023-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj.art-769eb0be0a62418e87f070739573b038; 2023-11-16T21:55:50Z; eng; MDPI AG; Mathematics; 2227-7390; 2023-02-01; vol. 11, no. 4, art. 920; 10.3390/math11040920; A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset; Arihant Tanwar (0); Wajdi Alghamdi (1); Mohammad D. Alahmadi (2); Harpreet Singh (3); Prashant Singh Rana (4); (0) Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India; (1) Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia; (2) Department of Software Engineering, College of Computer Science and Engineering, University of Jeddah, Jeddah 21959, Saudi Arabia; (3) Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India; (4) Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India; Feature selection is commonly used to identify the top n features that contribute most to a desired prediction, for example, to find the top 50 or 100 genes responsible for lung or kidney cancer out of 50,000 genes. It is therefore a highly time- and resource-consuming task. In this work, we propose a divide-and-conquer technique with fuzzy backward feature elimination (FBFE) that helps to find the important features quickly and accurately. To show the robustness of the proposed method, it is applied to eight different datasets taken from the NCBI database. We compare the proposed method with seven state-of-the-art feature selection methods and find that it is faster and achieves better classification accuracy. The proposed method works for qualitative, quantitative, continuous, and discrete datasets. A web service has been developed for researchers and academics to select the top n features.; https://www.mdpi.com/2227-7390/11/4/920; feature selection; divide-and-conquer technique; huge dimension dataset; genomic dataset; fuzzy technique; fuzzy backward feature elimination |
spellingShingle | Arihant Tanwar Wajdi Alghamdi Mohammad D. Alahmadi Harpreet Singh Prashant Singh Rana A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset Mathematics feature selection divide-and-conquer technique huge dimension dataset genomic dataset fuzzy technique fuzzy backward feature elimination |
title | A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset |
title_full | A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset |
title_fullStr | A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset |
title_full_unstemmed | A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset |
title_short | A Fuzzy-Based Fast Feature Selection Using Divide and Conquer Technique in Huge Dimension Dataset |
title_sort | fuzzy based fast feature selection using divide and conquer technique in huge dimension dataset |
topic | feature selection; divide-and-conquer technique; huge dimension dataset; genomic dataset; fuzzy technique; fuzzy backward feature elimination |
url | https://www.mdpi.com/2227-7390/11/4/920 |