A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data

Bibliographic Details
Main Authors: Iqbal Muhammad Zubair, Byunghoon Kim
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
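Subjects: Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data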
Online Access: https://ieeexplore.ieee.org/document/9966584/
collection DOAJ
description Group feature selection methods reduce model complexity by retaining the important feature groups and removing the irrelevant ones. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method for high-dimensional data based on a dimension reduction technique. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to the group, which reduces its multiple dimensions to a single dimension. Finally, we estimate the relative importance of the extracted features with a random forest and select the features whose importance scores are larger than those of the others. Machine-learning algorithms can then be trained and tested on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method, like the existing group feature selection method, selects a few important group features, and that it additionally provides the ranking and relative importance of all group features. SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting in terms of classification performance metrics.
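The three steps named in the description (Relief within each group, FDA to one dimension per group, random forest importance across groups) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a two-class problem, uses a simplified nearest-hit/nearest-miss Relief written for this example, and uses scikit-learn's `LinearDiscriminantAnalysis` as a stand-in for FDA. The function names `relief_weights` and `rank_groups`, the Relief threshold of 0, the forest size, and the synthetic data are all assumptions made here for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_weights(X, y):
    """Simplified two-class Relief (hypothetical helper): reward features that
    differ on the nearest miss, penalise those that differ on the nearest hit."""
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    Xs = (X - X.min(axis=0)) / span            # scale each feature to [0, 1]
    w = np.zeros(X.shape[1])
    for i in range(len(Xs)):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                       # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / len(Xs)


def rank_groups(X, y, groups, relief_threshold=0.0, seed=0):
    """Return (group names, importances), sorted by decreasing importance."""
    names, extracted = [], []
    for name, cols in groups.items():
        cols = np.asarray(cols)
        # Step 1: drop individual features that Relief scores as irrelevant.
        keep = cols[relief_weights(X[:, cols], y) > relief_threshold]
        if keep.size == 0:                     # whole group judged irrelevant
            continue
        # Step 2: project the surviving features of the group onto one axis.
        lda = LinearDiscriminantAnalysis(n_components=1)
        extracted.append(lda.fit_transform(X[:, keep], y).ravel())
        names.append(name)
    # Step 3: one extracted feature per group; rank groups by forest importance.
    Z = np.column_stack(extracted)
    forest = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Z, y)
    order = np.argsort(forest.feature_importances_)[::-1]
    return [names[i] for i in order], forest.feature_importances_[order]


# Synthetic demo data (an assumption): one class-dependent group, one pure-noise group.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
signal = 2.0 * y[:, None] + rng.normal(size=(200, 3))
noise = rng.normal(size=(200, 3))
X = np.hstack([signal, noise])
groups = {"informative": [0, 1, 2], "noise": [3, 4, 5]}

ranked, scores = rank_groups(X, y, groups)
print(list(zip(ranked, scores.round(3))))
```

Because every group is reduced to a single extracted feature before the forest is fitted, each importance score refers to a whole group, which is what makes a group-level ranking possible. A group can then be selected by keeping the top-ranked entries; the cut-off is left to the user in this sketch.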
first_indexed 2024-04-12T04:32:07Z
id doaj.art-1e98f7ed20bb45b1997b4fa32bc3b5d8
institution Directory Open Access Journal
issn 2169-3536
last_indexed 2024-04-12T04:32:07Z
spelling doaj.art-1e98f7ed20bb45b1997b4fa32bc3b5d8
2022-12-22T03:47:55Z
eng
IEEE
IEEE Access
2169-3536
2022-01-01
Volume 10, pages 125136-125147
DOI 10.1109/ACCESS.2022.3225685
9966584
A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data
Iqbal Muhammad Zubair (https://orcid.org/0000-0001-8897-4034), Department of Industrial and Management Engineering, Hanyang University, Ansan, South Korea
Byunghoon Kim (https://orcid.org/0000-0002-4377-2292), Department of Industrial and Management Engineering, Hanyang University, Ansan, South Korea
https://ieeexplore.ieee.org/document/9966584/
Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data
topic Dimension reduction
feature extraction
group feature ranking
group feature selection
high dimensional data