Feature selection of high dimensional data using Hybrid FSA-IG
Feature selection (FS) is the process of selecting a subset of relevant features with respect to a specific target variable, particularly when dealing with high-dimensional datasets. The aim of this paper is to compare the performance of different feature selection techniques on high dimensio...
Main Authors: | Mohd. Rosely, Nur Fatin Liyana; Mohd. Zain, Azlan; Yusoff, Yusliza |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2020 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/92504/1/NurFatinLiyana2020_FeatureSelectionofHighDimensionalData.pdf |
_version_ | 1796865437756030976 |
---|---|
author | Mohd. Rosely, Nur Fatin Liyana; Mohd. Zain, Azlan; Yusoff, Yusliza |
collection | ePrints |
description | Feature selection (FS) is the process of selecting a subset of relevant features with respect to a specific target variable, particularly when dealing with high-dimensional datasets. The aim of this paper is to compare the performance of different feature selection techniques on high-dimensional datasets. The techniques used are filter, wrapper and hybrid. Information gain (IG) represents the filter technique, the Fish Swarm Algorithm (FSA) represents the metaheuristic wrapper technique, and Hybrid FSA-IG represents the hybrid technique. Five datasets with different numbers of features are used: breast cancer, lung cancer, ovarian cancer, mixed-lineage leukaemia (MLL) and small round blue cell tumors (SRBCT). The results show that Hybrid FSA-IG selected the fewest features while retaining the significant features for every dataset, with accuracy improvements of 4.868% to 33.402% over IG and 1.706% to 25.154% over FSA. |
first_indexed | 2024-03-05T20:56:59Z |
format | Conference or Workshop Item |
id | utm.eprints-92504 |
institution | Universiti Teknologi Malaysia - ePrints |
language | English |
last_indexed | 2024-03-05T20:56:59Z |
publishDate | 2020 |
record_format | dspace |
spelling | utm.eprints-92504 2021-09-30T15:14:56Z http://eprints.utm.my/92504/ 2020-07-09 Conference or Workshop Item PeerReviewed application/pdf en Mohd. Rosely, Nur Fatin Liyana and Mohd. Zain, Azlan and Yusoff, Yusliza (2020) Feature selection of high dimensional data using Hybrid FSA-IG. In: 2nd Joint Conference on Green Engineering Technology and Applied Computing 2020, IConGETech 2020 and International Conference on Applied Computing 2020, ICAC 2020, 4 February 2020 - 5 February 2020, Bangkok, Thailand. http://dx.doi.org/10.1088/1757-899X/864/1/012066 |
title | Feature selection of high dimensional data using Hybrid FSA-IG |
topic | QA75 Electronic computers. Computer science |
url | http://eprints.utm.my/92504/1/NurFatinLiyana2020_FeatureSelectionofHighDimensionalData.pdf |
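
The description field above names information gain (IG) as the filter technique. As a rough illustration only, and not the authors' implementation, the following Python sketch ranks discrete features by information gain. The function names (`entropy`, `information_gain`, `rank_features`) and the toy data are assumptions made for this example and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over values v of p(v) * H(labels | feature == v)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, top_k):
    """Return the indices of the top_k features by information gain."""
    n_features = len(rows[0])
    scores = [
        (information_gain([row[j] for row in rows], labels), j)
        for j in range(n_features)
    ]
    # Highest gain first; ties broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top_k]]


# Hypothetical toy data: three discretised features and a binary class label.
rows = [
    ("high", "a", "x"),
    ("high", "b", "x"),
    ("low", "a", "x"),
    ("low", "b", "x"),
]
labels = ["tumour", "tumour", "normal", "normal"]
selected = rank_features(rows, labels, top_k=1)
```

Here only the first feature separates the two classes, so it is the one kept. In a hybrid scheme such as the one the abstract describes, a filter ranking like this would be combined with a wrapper search; how the paper combines IG with FSA is not stated in this record.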