A New Filter Feature Selection Based on Criteria Fusion for Gene Microarray Data

In machine learning and data mining, feature selection aims to seek a compact and discriminant feature subset from the original feature space. It is usually used as a preprocessing step to improve the prediction performance, understandability, scalability, and generalization capability of classifier...

Bibliographic Details
Main Authors: Wenjun Ke, Chunxue Wu, Yan Wu, Neal N. Xiong
Format: Article
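Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Dimension reduction, feature selection, high-dimensional data, criteria fusion, cancer prediction
Online Access: https://ieeexplore.ieee.org/document/8482117/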
collection DOAJ
description In machine learning and data mining, feature selection aims to seek a compact and discriminant feature subset from the original feature space. It is usually used as a preprocessing step to improve the prediction performance, understandability, scalability, and generalization capability of classifiers. A typical gene microarray data set is characterized by high dimensionality, few samples, and a majority of irrelevant features, which makes it difficult to discover a compact set of features that really contribute to the response of the model. In this paper, a score-based criteria fusion feature selection method (SCF) is proposed for cancer prediction, with the aim of improving the prediction performance of the classification model. The SCF method is evaluated on five open gene microarray data sets and three low-dimensional data sets, and it shows superior performance over many well-known feature selection methods when two classifiers, SVM and KNN, are used to measure the quality of the selected features. Experiments verify that SCF is able to find more discriminative features than the competing methods and can be used as a preprocessing algorithm in effective combination with other methods.
first_indexed 2024-12-13T13:23:28Z
format Article
id doaj.art-f1970f67447e4fb69de01e9b4da09d72
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-13T13:23:28Z
publishDate 2018-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-f1970f67447e4fb69de01e9b4da09d72
2022-12-21T23:44:21Z
eng
IEEE
IEEE Access
2169-3536
2018-01-01
Volume 6, pages 61065-61076
10.1109/ACCESS.2018.2873634
8482117
Wenjun Ke (School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China)
Chunxue Wu, https://orcid.org/0000-0003-4938-4570 (School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China)
Yan Wu (School of Public and Environmental Affairs, Indiana University Bloomington, Bloomington, IN, USA)
Neal N. Xiong, https://orcid.org/0000-0002-0394-4635 (Department of Mathematics and Computer Science, Northeastern State University Tahlequah, Tahlequah, OK, USA)
title A New Filter Feature Selection Based on Criteria Fusion for Gene Microarray Data
topic Dimension reduction
feature selection
high-dimensional data
criteria fusion
cancer prediction
url https://ieeexplore.ieee.org/document/8482117/