A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure

Data dimensionality has grown with advances in information and measurement technologies. Because of this high dimensionality, many analyses, such as classification and regression, require data reduction beforehand. In the solution of high-dim...

Bibliographic Details
Main Author: Serkan Akogul
Format: Article
Language: English
Published: IEEE 2023-01-01
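Series: IEEE Access
Subjects: Feature selection; filter feature selection; Gaussian mixture model (GMM); Gaussian mixture discriminant analysis (GMDA)
Online Access: https://ieeexplore.ieee.org/document/10286827/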
_version_ 1797649801903341568
author Serkan Akogul
collection DOAJ
description Data dimensionality has grown with advances in information and measurement technologies. Because of this high dimensionality, many analyses, such as classification and regression, require data reduction beforehand. To address high dimensionality, filter feature selection methods based on statistical criteria are widely used for their simplicity and efficiency. An important problem with these methods is that, when features are strongly correlated, they unnecessarily select multiple features carrying the same information. This study proposes a novel approach to solve this problem. The proposed approach also resolves the question of how many features should be included. Its performance is demonstrated on high-dimensional reflectance data with high correlations between features. The results show that the proposed approach improves the classification performance of filter feature selection methods in mixture discriminant analysis in terms of classification accuracy and entropy criteria.
first_indexed 2024-03-11T15:51:15Z
format Article
id doaj.art-b4a7e79d0b404d2fa0d52c4ee9c4fa06
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-11T15:51:15Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-b4a7e79d0b404d2fa0d52c4ee9c4fa06 | 2023-10-25T23:00:20Z | eng | IEEE | IEEE Access | 2169-3536 | 2023-01-01 | 11 | 115025 | 115032 | 10.1109/ACCESS.2023.3325331 | 10286827 | A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure | Serkan Akogul | 0 | https://orcid.org/0000-0002-0346-4308 | Department of Statistics, Faculty of Science, Pamukkale University, Denizli, Turkey | https://ieeexplore.ieee.org/document/10286827/ | Feature selection | filter feature selection | Gaussian mixture model (GMM) | Gaussian mixture discriminant analysis (GMDA)
title A Novel Approach to Increase the Efficiency of Filter-Based Feature Selection Methods in High-Dimensional Datasets With Strong Correlation Structure
topic Feature selection
filter feature selection
Gaussian mixture model (GMM)
Gaussian mixture discriminant analysis (GMDA)
url https://ieeexplore.ieee.org/document/10286827/