A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data
Group feature selection methods reduce model complexity by retaining the important feature groups and removing the irrelevant ones. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. For this pur...
Main Authors: | Iqbal Muhammad Zubair, Byunghoon Kim |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
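Subjects: | Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data |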
Online Access: | https://ieeexplore.ieee.org/document/9966584/ |
_version_ | 1811208969470869504 |
---|---|
author | Iqbal Muhammad Zubair; Byunghoon Kim |
author_facet | Iqbal Muhammad Zubair; Byunghoon Kim |
author_sort | Iqbal Muhammad Zubair |
collection | DOAJ |
description | Group feature selection methods reduce model complexity by retaining the important feature groups and removing the irrelevant ones. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method for high-dimensional data based on a dimension reduction technique. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to the group, which reduces its multiple dimensions to a single dimension. Finally, we estimate the relative importance of the extracted features with a random forest and select the features whose importance scores are larger than those of the others. Machine-learning algorithms can then be trained and tested on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method selects a few important feature groups, as the existing group feature selection method does, and additionally provides the ranking and relative importance of all feature groups. SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting in terms of classification performance metrics. |
first_indexed | 2024-04-12T04:32:07Z |
format | Article |
id | doaj.art-1e98f7ed20bb45b1997b4fa32bc3b5d8 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-12T04:32:07Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-1e98f7ed20bb45b1997b4fa32bc3b5d8; 2022-12-22T03:47:55Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 125136; 125147; 10.1109/ACCESS.2022.3225685; 9966584; A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data; Iqbal Muhammad Zubair (0, https://orcid.org/0000-0001-8897-4034); Byunghoon Kim (1, https://orcid.org/0000-0002-4377-2292); Department of Industrial and Management Engineering, Hanyang University, Ansan, South Korea; Department of Industrial and Management Engineering, Hanyang University, Ansan, South Korea; Group feature selection methods reduce model complexity by retaining the important feature groups and removing the irrelevant ones. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method for high-dimensional data based on a dimension reduction technique. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to the group, which reduces its multiple dimensions to a single dimension. Finally, we estimate the relative importance of the extracted features with a random forest and select the features whose importance scores are larger than those of the others. Machine-learning algorithms can then be trained and tested on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method selects a few important feature groups, as the existing group feature selection method does, and additionally provides the ranking and relative importance of all feature groups. SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting in terms of classification performance metrics.; https://ieeexplore.ieee.org/document/9966584/; Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data |
spellingShingle | Iqbal Muhammad Zubair Byunghoon Kim A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data IEEE Access Dimension reduction feature extraction group feature ranking group feature selection high dimensional data |
title | A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data |
title_full | A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data |
title_fullStr | A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data |
title_full_unstemmed | A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data |
title_short | A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data |
title_sort | group feature ranking and selection method based on dimension reduction technique in high dimensional data |
topic | Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data |
url | https://ieeexplore.ieee.org/document/9966584/ |
work_keys_str_mv | AT iqbalmuhammadzubair agroupfeaturerankingandselectionmethodbasedondimensionreductiontechniqueinhighdimensionaldata AT byunghoonkim agroupfeaturerankingandselectionmethodbasedondimensionreductiontechniqueinhighdimensionaldata AT iqbalmuhammadzubair groupfeaturerankingandselectionmethodbasedondimensionreductiontechniqueinhighdimensionaldata AT byunghoonkim groupfeaturerankingandselectionmethodbasedondimensionreductiontechniqueinhighdimensionaldata |
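
The three steps named in the description (Relief within each group, FDA to collapse each group to one dimension, random forest importance over the extracted features) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `relief_scores` and `rank_groups`, the `keep` fraction used to filter features, the basic two-class Relief variant, the use of scikit-learn's `LinearDiscriminantAnalysis` as the FDA step, and the synthetic data are all choices made for this example and do not come from the record.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_scores(X, y, n_iter=100, seed=0):
    """Basic two-class Relief weights; a higher weight means a more relevant feature."""
    rng = np.random.default_rng(seed)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant features
    Xs = (X - X.min(axis=0)) / span  # scale every feature to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(Xs), size=n_iter):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)
        dist[i] = np.inf  # never pick the sampled point itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n_iter


def rank_groups(X, y, groups, keep=0.5, seed=0):
    """groups maps a group name to its column indices. Returns (ranking, Z)."""
    names, cols = [], []
    for name, idx in groups.items():
        idx = np.asarray(idx)
        # Step 1: keep the top `keep` fraction by Relief weight (assumed rule).
        w = relief_scores(X[:, idx], y, seed=seed)
        k = max(1, int(np.ceil(keep * len(idx))))
        kept = idx[np.argsort(w)[::-1][:k]]
        # Step 2: project the kept features onto a single discriminant axis.
        lda = LinearDiscriminantAnalysis(n_components=1)
        cols.append(lda.fit_transform(X[:, kept], y)[:, 0])
        names.append(name)
    Z = np.column_stack(cols)  # one extracted feature per group
    # Step 3: random forest importance of each extracted feature.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Z, y)
    imp = rf.feature_importances_
    order = np.argsort(imp)[::-1]
    return [(names[j], float(imp[j])) for j in order], Z


# Synthetic data (not from the paper): only group "g1" carries class signal.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 12))
X[:, :4] += 1.5 * y[:, None]
groups = {"g1": [0, 1, 2, 3], "g2": [4, 5, 6, 7], "g3": [8, 9, 10, 11]}

ranking, Z = rank_groups(X, y, groups)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The returned `ranking` lists every group with its importance score, so a cutoff for selecting groups can be applied afterwards; the record does not state the threshold the authors use. In a real evaluation the Relief, FDA, and random forest steps would be fitted on a training split only, since fitting them on all samples leaks label information into the extracted features.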