A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure
Advances in information and measurement technologies have increased the dimensionality of data. Because of this high dimensionality, many analyses, such as classification and regression, require a data reduction step beforehand. In the solution of high-dim...
Main Author: | Serkan Akogul |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
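Subjects: | Feature selection; filter feature selection; Gaussian mixture model (GMM); Gaussian mixture discriminant analysis (GMDA) |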
Online Access: | https://ieeexplore.ieee.org/document/10286827/ |
_version_ | 1797649801903341568 |
---|---|
author | Serkan Akogul |
author_facet | Serkan Akogul |
author_sort | Serkan Akogul |
collection | DOAJ |
description | Advances in information and measurement technologies have increased the dimensionality of data. Because of this high dimensionality, many analyses, such as classification and regression, require a data reduction step beforehand. Filter feature selection methods based on statistical criteria are widely used to address high dimensionality because they are simple and efficient. An important problem with these methods is that, when features are strongly correlated, they unnecessarily select multiple features carrying the same information. This study proposes a novel approach to solve this problem. The proposed approach also answers the question of how many features should be included. Its performance is demonstrated on high-dimensional reflectance data with high correlations between features. The results show that the proposed approach improves the classification performance of filter feature selection methods in mixture discriminant analysis in terms of classification accuracy and entropy criteria. |
first_indexed | 2024-03-11T15:51:15Z |
format | Article |
id | doaj.art-b4a7e79d0b404d2fa0d52c4ee9c4fa06 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-11T15:51:15Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-b4a7e79d0b404d2fa0d52c4ee9c4fa06; 2023-10-25T23:00:20Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 115025; 115032; 10.1109/ACCESS.2023.3325331; 10286827; A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure; Serkan Akogul (https://orcid.org/0000-0002-0346-4308), Department of Statistics, Faculty of Science, Pamukkale University, Denizli, Turkey; https://ieeexplore.ieee.org/document/10286827/; Feature selection; filter feature selection; Gaussian mixture model (GMM); Gaussian mixture discriminant analysis (GMDA) |
spellingShingle | Serkan Akogul; A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure; IEEE Access; Feature selection; filter feature selection; Gaussian mixture model (GMM); Gaussian mixture discriminant analysis (GMDA) |
title | A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure |
title_sort | novel approach to increase the efficiency of filter based feature selection methods in high dimensional datasets with strong correlation structure |
topic | Feature selection; filter feature selection; Gaussian mixture model (GMM); Gaussian mixture discriminant analysis (GMDA) |
url | https://ieeexplore.ieee.org/document/10286827/ |
work_keys_str_mv | AT serkanakogul anovelapproachtoincreasetheefficiencyoffilterbasedfeatureselectionmethodsinhighdimensionaldatasetswithstrongcorrelationstructure AT serkanakogul novelapproachtoincreasetheefficiencyoffilterbasedfeatureselectionmethodsinhighdimensionaldatasetswithstrongcorrelationstructure |
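
The redundancy problem named in the description (a filter method selecting several strongly correlated features that carry the same information) can be illustrated with a small sketch. This is not the approach proposed in the article, whose details this record does not give. It is a generic greedy correlation-threshold filter, assuming a one-way ANOVA F statistic as the filter criterion. The function names, the `max_abs_corr` threshold of 0.9, and the synthetic data are all hypothetical choices made for illustration.

```python
import random
from statistics import fmean


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def f_score(values, labels):
    """One-way ANOVA F statistic of a single feature (assumed filter criterion)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = fmean(values)
    between = 0.0
    within = 0.0
    for group in groups.values():
        group_mean = fmean(group)
        between += len(group) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in group)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (between / df_between) / (within / df_within)


def select_non_redundant(columns, labels, max_abs_corr=0.9):
    """Rank features by F score, then greedily skip any feature whose absolute
    correlation with an already kept feature exceeds max_abs_corr."""
    scores = [f_score(col, labels) for col in columns]
    order = sorted(range(len(columns)), key=scores.__getitem__, reverse=True)
    kept = []
    for j in order:
        if all(abs(pearson(columns[j], columns[k])) <= max_abs_corr for k in kept):
            kept.append(j)
    return kept


# Synthetic demonstration data (hypothetical): feature 1 is a near copy of
# feature 0, and feature 2 is pure noise unrelated to the class label.
random.seed(0)
labels = [0] * 100 + [1] * 100
informative = [lab + random.gauss(0, 0.5) for lab in labels]
near_copy = [v + random.gauss(0, 0.01) for v in informative]
noise = [random.gauss(0, 1) for _ in labels]

kept = select_non_redundant([informative, near_copy, noise], labels)
print(kept)
```

Ranking by score alone would place both the informative feature and its near copy at the top. The greedy check keeps only one of the pair, so the second slot goes to a feature carrying different information. A fixed threshold like this does not answer how many features to keep, which the article says its approach also addresses.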