An evolutionary computation classification method for high‐dimensional mixed missing variables data

Abstract Data missing is a prevalent issue in various real‐world systems. It may deteriorate the performance of classification algorithms running on these platforms. Numerous effective imputation methods exist to address this problem. However, traditional data imputation approaches mainly focus on l...


Bibliographic Details
Main Authors:	Mengmeng Li, Yi Liu, Qibin Zheng, Gengsong Li, Wei Qin
Format:	Article
Language:	English
Published:	Wiley 2023-12-01
Series:	Electronics Letters
Subjects:	evolutionary computation feature selection pattern classification
Online Access:	https://doi.org/10.1049/ell2.13058

_version_	1797362151601471488
author	Mengmeng Li Yi Liu Qibin Zheng Gengsong Li Wei Qin
collection	DOAJ
description	Abstract Data missing is a prevalent issue in various real‐world systems. It may deteriorate the performance of classification algorithms running on these platforms. Numerous effective imputation methods exist to address this problem. However, traditional data imputation approaches mainly focus on low‐dimensional missing data, and in addition, they do not make use of the randomness of the missing values and the information of labels simultaneously. To solve these problems, the authors propose a novel data imputation algorithm, named Particle Swarm Optimization for High‐dimensional mixed Missing variables data (PSOHM). PSOHM introduces a feature filtering algorithm for feature selection on missing data, followed by a feature discrimination method to categorize chosen features. PSOHM then employs particle swarm optimization to optimize imputation functions for both continuous and discrete features. Continuous features are modelled as Gaussian distributions, with the mean and standard deviation encoded into particles. Additionally, the probabilities of values for discrete features are also encoded. Moreover, accuracy serves as the optimization objective, utilizing both the randomness of missing values and the label information to improve the algorithm's performance. Six typical algorithms are employed to make a comparison. The results demonstrate that the authors’ method is superior to the compared approaches on the six different kinds of classical datasets.
first_indexed	2024-03-08T16:04:16Z
format	Article
id	doaj.art-37ff32d549654147913611b1a63f698d
institution	Directory Open Access Journal
issn	0013-5194 1350-911X
language	English
last_indexed	2024-03-08T16:04:16Z
publishDate	2023-12-01
publisher	Wiley
record_format	Article
series	Electronics Letters
spelling	doaj.art-37ff32d549654147913611b1a63f698d; 2024-01-08T08:30:54Z; eng; Wiley; Electronics Letters; ISSN 0013-5194, 1350-911X; 2023-12-01; vol. 59, no. 24; pages n/a; DOI 10.1049/ell2.13058; An evolutionary computation classification method for high‐dimensional mixed missing variables data; Mengmeng Li (Academy of Military Sciences, Beijing, China), Yi Liu (Academy of Military Sciences, Beijing, China), Qibin Zheng (Academy of Military Sciences, Beijing, China), Gengsong Li (National Innovation Institute of Defense Technology, Beijing, China), Wei Qin (Academy of Military Sciences, Beijing, China); https://doi.org/10.1049/ell2.13058; evolutionary computation, feature selection, pattern classification
title	An evolutionary computation classification method for high‐dimensional mixed missing variables data
topic	evolutionary computation feature selection pattern classification
url	https://doi.org/10.1049/ell2.13058
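The description field above says that PSOHM encodes the mean and standard deviation of a Gaussian (for continuous features) and the value probabilities (for discrete features) into particles, and uses classification accuracy as the optimization objective. The following is a minimal sketch of that idea only, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the function names, the toy data with one continuous and one binary feature, the nearest-centroid classifier standing in for an unspecified classifier, and the PSO coefficients (0.7, 1.5, 1.5). The feature filtering and feature discrimination steps are omitted.

```python
import random

def decode(particle):
    """Split a particle into Gaussian parameters and category probabilities."""
    mu, sigma = particle[0], abs(particle[1]) + 1e-6    # keep the std positive
    weights = [abs(w) + 1e-6 for w in particle[2:]]      # keep weights positive
    total = sum(weights)
    return mu, sigma, [w / total for w in weights]

def impute(rows, particle, seed=0):
    """Fill None entries by sampling from the distributions the particle encodes."""
    rng = random.Random(seed)   # fixed seed so a particle's fitness is repeatable
    mu, sigma, probs = decode(particle)
    filled = []
    for cont, disc in rows:
        if cont is None:
            cont = rng.gauss(mu, sigma)
        if disc is None:
            disc = rng.choices(range(len(probs)), weights=probs)[0]
        filled.append((cont, disc))
    return filled

def accuracy(rows, labels):
    """Training accuracy of a nearest-centroid classifier (assumed stand-in)."""
    classes = sorted(set(labels))
    cents = {}
    for c in classes:
        pts = [r for r, y in zip(rows, labels) if y == c]
        cents[c] = tuple(sum(p[i] for p in pts) / len(pts) for i in range(2))
    def predict(r):
        return min(classes,
                   key=lambda c: sum((r[i] - cents[c][i]) ** 2 for i in range(2)))
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def pso_impute(rows, labels, n_particles=10, n_iter=20, seed=1):
    """Global-best PSO over [mu, sigma, w(disc=0), w(disc=1)], maximizing accuracy."""
    rng = random.Random(seed)
    dim = 4
    fit = lambda p: accuracy(impute(rows, p), labels)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pfit = [fit(p) for p in pos]
    g = max(range(n_particles), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fit(pos[i])
            if f > pfit[i]:
                pbest[i], pfit[i] = pos[i][:], f
            if f > gfit:
                gbest, gfit = pos[i][:], f
    return gbest, gfit

# Toy data (hypothetical): (continuous, discrete) pairs, None marks a missing value.
rows = [(0.1, 0), (None, 0), (0.3, None), (2.0, 1), (None, 1), (2.2, None)]
labels = [0, 0, 0, 1, 1, 1]
best, best_fit = pso_impute(rows, labels)
completed = impute(rows, best)
```

Here `best` holds the encoded imputation parameters and `completed` is the data set with every missing entry filled by sampling. Scoring accuracy on the training rows keeps the sketch short; a held-out split or cross-validation would be the more cautious choice in practice.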


Similar Items