A New Filter Feature Selection Based on Criteria Fusion for Gene Microarray Data
In machine learning and data mining, feature selection aims to seek a compact and discriminant feature subset from the original feature space. It is usually used as a preprocessing step to improve the prediction performance, understandability, scalability, and generalization capability of classifier...
Main Authors: | Wenjun Ke, Chunxue Wu, Yan Wu, Neal N. Xiong |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2018-01-01 |
Series: | IEEE Access |
Subjects: | Dimension reduction, feature selection, high-dimensional data, criteria fusion, cancer prediction |
Online Access: | https://ieeexplore.ieee.org/document/8482117/ |
_version_ | 1828891796042678272 |
---|---|
author | Wenjun Ke; Chunxue Wu; Yan Wu; Neal N. Xiong |
author_facet | Wenjun Ke; Chunxue Wu; Yan Wu; Neal N. Xiong |
author_sort | Wenjun Ke |
collection | DOAJ |
description | In machine learning and data mining, feature selection aims to seek a compact and discriminant feature subset from the original feature space. It is usually used as a preprocessing step to improve the prediction performance, understandability, scalability, and generalization capability of classifiers. A typical gene microarray data set is characterized by high dimensionality, limited samples, and mostly irrelevant features, which makes it difficult to discover a compact set of features that really contribute to the response of the model. In this paper, a score-based criteria fusion feature selection method (SCF) is proposed for cancer prediction, with the aim of improving the prediction performance of the classification model. The SCF method is evaluated on five open gene microarray data sets and three low-dimensional data sets, and it shows superior performance over many well-known feature selection methods when two classifiers, SVM and KNN, are employed to measure the quality of the selected features. Experiments verify that SCF is able to find more discriminative features than the competing methods and can be used effectively as a preprocessing algorithm in combination with other methods. |
first_indexed | 2024-12-13T13:23:28Z |
format | Article |
id | doaj.art-f1970f67447e4fb69de01e9b4da09d72 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-13T13:23:28Z |
publishDate | 2018-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-f1970f67447e4fb69de01e9b4da09d72; 2022-12-21T23:44:21Z; eng; IEEE; IEEE Access; 2169-3536; 2018-01-01; 6; 61065; 61076; 10.1109/ACCESS.2018.2873634; 8482117; A New Filter Feature Selection Based on Criteria Fusion for Gene Microarray Data; Wenjun Ke (0); Chunxue Wu (1), https://orcid.org/0000-0003-4938-4570; Yan Wu (2); Neal N. Xiong (3), https://orcid.org/0000-0002-0394-4635; School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China; School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China; School of Public and Environmental Affairs, Indiana University Bloomington, Bloomington, IN, USA; Department of Mathematics and Computer Science, Northeastern State University Tahlequah, Tahlequah, OK, USA; https://ieeexplore.ieee.org/document/8482117/; Dimension reduction; feature selection; high-dimensional data; criteria fusion; cancer prediction |
spellingShingle | Wenjun Ke; Chunxue Wu; Yan Wu; Neal N. Xiong; A New Filter Feature Selection Based on Criteria Fusion for Gene Microarray Data; IEEE Access; Dimension reduction; feature selection; high-dimensional data; criteria fusion; cancer prediction |
title | A New Filter Feature Selection Based on Criteria Fusion for Gene Microarray Data |
title_sort | new filter feature selection based on criteria fusion for gene microarray data |
topic | Dimension reduction; feature selection; high-dimensional data; criteria fusion; cancer prediction |
url | https://ieeexplore.ieee.org/document/8482117/ |