A filter feature selection for high-dimensional data

In a classification problem, before building a prediction model, it is very important to identify informative features rather than using tens or thousands of features, which may penalize some learning methods and increase the risk of over-fitting. To overcome these problems, the best solution is to use feature s...


Bibliographic Details
Main Authors: Fatima Zahra Janane, Tayeb Ouaderhman, Hasna Chamlal
Format: Article
Language: English
Published: SAGE Publishing 2023-07-01
Series: Journal of Algorithms & Computational Technology
Online Access: https://doi.org/10.1177/17483026231184171
author Fatima Zahra Janane
Tayeb Ouaderhman
Hasna Chamlal
collection DOAJ
description In a classification problem, it is important to identify the informative features before building a prediction model, rather than using tens or thousands of features, which may penalize some learning methods and increase the risk of over-fitting. Feature selection is an effective way to overcome these problems. In this article, we propose a new filter method for feature selection that combines the Relief filter algorithm with the multi-criteria decision-making method TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), modeling the feature selection task as a multi-criteria decision problem. Following the Relief methodology, a decision matrix is computed and passed to TOPSIS, which ranks the features from best to worst. To evaluate the performance of the proposed approach, a simulation study comprising a set of experiments and case studies was conducted on three synthetic dataset scenarios. The results confirm the effectiveness of the proposed filter in detecting the most informative features.
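The pipeline described above (Relief-style statistics collected into a decision matrix, then ranked with TOPSIS) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say which criteria make up the decision matrix, so the two columns used here (mean distance to the nearest miss as a benefit criterion, mean distance to the nearest hit as a cost criterion), the equal weights, and the function names `relief_criteria` and `topsis` are all assumptions chosen for illustration.

```python
import numpy as np

def relief_criteria(X, y):
    """Hypothetical decision matrix, one row per feature:
    column 0 = mean |difference| to the nearest miss (higher is better),
    column 1 = mean |difference| to the nearest hit (lower is better).
    Assumes every class contains at least two samples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # guard against constant features
    Xs = (X - X.min(axis=0)) / span            # scale each feature to [0, 1]
    n, p = Xs.shape
    hit, miss = np.zeros(p), np.zeros(p)
    for i in range(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                       # a sample is not its own neighbour
        same = y == y[i]
        h = np.argmin(np.where(same, dist, np.inf))   # nearest hit
        m = np.argmin(np.where(~same, dist, np.inf))  # nearest miss
        hit += np.abs(Xs[i] - Xs[h])
        miss += np.abs(Xs[i] - Xs[m])
    return np.column_stack([miss / n, hit / n])

def topsis(D, weights, benefit):
    """Standard TOPSIS closeness coefficient for each row (alternative) of D."""
    D = np.asarray(D, dtype=float)
    norm = np.linalg.norm(D, axis=0)
    norm[norm == 0] = 1.0
    V = D / norm * weights                     # weighted, vector-normalised matrix
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)  # distance to the ideal solution
    d_neg = np.linalg.norm(V - anti, axis=1)   # distance to the anti-ideal solution
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.zeros_like(d_neg), where=denom > 0)

# Synthetic demo (made-up data): feature 0 carries the class signal, the rest are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 5))
X[:, 0] += 4.0 * y

scores = topsis(relief_criteria(X, y),
                weights=np.array([0.5, 0.5]),      # equal weights: an assumption
                benefit=np.array([True, False]))
ranking = np.argsort(-scores)                      # best feature first
print(ranking, scores.round(3))
```

Treating features as TOPSIS alternatives and Relief-derived quantities as criteria matches the framing in the description; other choices of criteria or weights would change the ranking.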
first_indexed 2024-03-13T00:54:33Z
format Article
id doaj.art-c7c328b06bc541d28848a66ca9e5e37d
institution Directory Open Access Journal
issn 1748-3026
language English
last_indexed 2024-03-13T00:54:33Z
publishDate 2023-07-01
publisher SAGE Publishing
record_format Article
series Journal of Algorithms & Computational Technology
title A filter feature selection for high-dimensional data
url https://doi.org/10.1177/17483026231184171