Feature selection of high dimensional data using Hybrid FSA-IG

Feature selection (FS) is the process of selecting a subset of relevant features based on the specific target variables, especially when dealing with high-dimensional datasets. The aim of this paper is to compare the performance of different feature selection techniques on high dimensio...

Bibliographic Details
Main Authors: Mohd. Rosely, Nur Fatin Liyana, Mohd. Zain, Azlan, Yusoff, Yusliza
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/92504/1/NurFatinLiyana2020_FeatureSelectionofHighDimensionalData.pdf
collection ePrints
description Feature selection (FS) is the process of selecting a subset of relevant features based on the specific target variables, especially when dealing with high-dimensional datasets. The aim of this paper is to compare the performance of different feature selection techniques on high-dimensional datasets. The techniques used are filter, wrapper and hybrid. Information gain (IG) represents the filter technique, the Fish Swarm Algorithm (FSA) represents the metaheuristic wrapper technique, and Hybrid FSA-IG represents the hybrid technique. Five datasets with different numbers of features are used: breast cancer, lung cancer, ovarian cancer, mixed-lineage leukaemia (MLL) and small round blue cell tumors (SRBCT). The results show that Hybrid FSA-IG selected the fewest features while still capturing the significant features of every dataset, with accuracy improvements of 4.868% to 33.402% over IG and 1.706% to 25.154% over FSA.
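The filter step named in the description, ranking features by information gain, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`entropy`, `information_gain`, `rank_features`), the toy data, and the assumption that feature values are already discrete are all hypothetical. Continuous data such as gene expression values would need to be discretized first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature), for a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, top_k):
    """Return the indices of the top_k features by IG, plus all scores."""
    n_features = len(rows[0])
    scores = [information_gain([row[j] for row in rows], labels)
              for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k], scores


# Hypothetical toy data: feature 0 determines the class, the others do not.
rows = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
labels = [0, 0, 1, 1]
selected, scores = rank_features(rows, labels, top_k=1)
print(selected, scores)
```

A filter like this scores each feature independently of any classifier. A wrapper such as FSA instead searches over feature subsets and evaluates each one with a classifier, which is why a hybrid may use the filter ranking to narrow the search space first.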
first_indexed 2024-03-05T20:56:59Z
id utm.eprints-92504
institution Universiti Teknologi Malaysia - ePrints
last_indexed 2024-03-05T20:56:59Z
record_format dspace
spelling utm.eprints-92504 2021-09-30T15:14:56Z http://eprints.utm.my/92504/ QA75 Electronic computers. Computer science 2020-07-09 Conference or Workshop Item PeerReviewed application/pdf en
Mohd. Rosely, Nur Fatin Liyana and Mohd. Zain, Azlan and Yusoff, Yusliza (2020) Feature selection of high dimensional data using Hybrid FSA-IG. In: 2nd Joint Conference on Green Engineering Technology and Applied Computing 2020, IConGETech 2020 and International Conference on Applied Computing 2020, ICAC 2020, 4 February 2020 - 5 February 2020, Bangkok, Thailand. http://dx.doi.org/10.1088/1757-899X/864/1/012066
topic QA75 Electronic computers. Computer science