A filter feature selection for high-dimensional data
In a classification problem, it is important to identify the informative features before building a prediction model, rather than using tens or thousands of features, which may penalize some learning methods and increase the risk of over-fitting. To overcome these problems, the best solution is to use feature s...
Main Authors: | Fatima Zahra Janane, Tayeb Ouaderhman, Hasna Chamlal |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing 2023-07-01 |
Series: | Journal of Algorithms & Computational Technology |
Online Access: | https://doi.org/10.1177/17483026231184171 |
_version_ | 1797785472396689408 |
---|---|
author | Fatima Zahra Janane Tayeb Ouaderhman Hasna Chamlal |
collection | DOAJ |
description | In a classification problem, it is important to identify the informative features before building a prediction model, rather than using tens or thousands of features, which may penalize some learning methods and increase the risk of over-fitting. Feature selection addresses these problems. In this article, we propose a new filter method for feature selection that combines the Relief filter algorithm with the multi-criteria decision-making method TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), modeling the feature selection task as a multi-criteria decision problem. Using the Relief methodology, a decision matrix is computed and passed to TOPSIS, which ranks the features from the best to the least informative. To evaluate the performance of the proposed approach, a simulation study comprising a set of experiments and case studies was conducted on three synthetic dataset scenarios. The results confirm the effectiveness of the proposed filter in detecting the most informative features. |
first_indexed | 2024-03-13T00:54:33Z |
format | Article |
id | doaj.art-c7c328b06bc541d28848a66ca9e5e37d |
institution | Directory of Open Access Journals |
issn | 1748-3026 |
language | English |
last_indexed | 2024-03-13T00:54:33Z |
publishDate | 2023-07-01 |
publisher | SAGE Publishing |
record_format | Article |
series | Journal of Algorithms & Computational Technology |
title | A filter feature selection for high-dimensional data |
url | https://doi.org/10.1177/17483026231184171 |
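
The description above outlines a two-stage pipeline: Relief-style statistics fill a decision matrix, and TOPSIS ranks the features from it. The record does not say which criteria the authors place in that matrix, so the following is only a minimal sketch under stated assumptions, not the published method. It assumes two illustrative criteria per feature (mean difference to the nearest miss, treated as a benefit, and mean difference to the nearest hit, treated as a cost), equal weights, and at least two samples per class. The function names `relief_decision_matrix` and `topsis` are hypothetical.

```python
import numpy as np


def relief_decision_matrix(X, y):
    """Build a (features x 2) matrix: [mean nearest-miss diff, mean nearest-hit diff].

    The choice of these two criteria is an assumption made for illustration.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Min-max scale each feature so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    n, p = Xs.shape
    miss = np.zeros(p)
    hit = np.zeros(p)
    for i in range(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to every sample
        dist[i] = np.inf                       # exclude the sample itself
        same = y == y[i]
        hit_idx = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss_idx = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        hit += np.abs(Xs[i] - Xs[hit_idx])
        miss += np.abs(Xs[i] - Xs[miss_idx])
    return np.column_stack([miss / n, hit / n])


def topsis(D, weights, benefit):
    """Return the TOPSIS closeness coefficient of each row (alternative) of D."""
    D = np.asarray(D, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Vector-normalize each criterion column, then apply the weights.
    norm = np.linalg.norm(D, axis=0)
    norm[norm == 0] = 1.0
    V = D / norm * weights
    # Ideal and anti-ideal points depend on whether a criterion is a benefit or a cost.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    denom = d_pos + d_neg
    denom[denom == 0] = 1.0  # all alternatives identical: closeness stays 0
    return d_neg / denom


# Toy data (made up for this sketch): feature 0 tracks the class, features 1-2 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = np.column_stack([
    y + 0.1 * rng.standard_normal(100),
    rng.uniform(size=100),
    rng.uniform(size=100),
])

D = relief_decision_matrix(X, y)
scores = topsis(D, weights=[0.5, 0.5], benefit=[True, False])
ranking = np.argsort(-scores)  # feature indices, highest closeness first
print(ranking, scores)
```

A higher closeness coefficient means the feature lies nearer the ideal point (large nearest-miss difference, small nearest-hit difference), so sorting by it in descending order gives a best-to-worst ranking. The weights and the benefit/cost labels are the main tuning choices in this sketch.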