A flexible ensemble regression model of extreme learning machine for missing value imputation of DNA microarray data

Missing value imputation (MVI) is important for DNA microarray data analysis because missing values significantly degrade the performance of downstream analysis. Although many MVI algorithms exist for missing DNA microarray data, we note t...


Bibliographic Details
Main Authors: Pan Xiuwei, Dong Wenlu, Yu Hualong
Format: Article
Language: English
Published: EDP Sciences 2023-01-01
Series: SHS Web of Conferences
Online Access: https://www.shs-conferences.org/articles/shsconf/pdf/2023/15/shsconf_eimm2023_01077.pdf
_version_ 1797829339580989440
author Pan Xiuwei
Dong Wenlu
Yu Hualong
author_facet Pan Xiuwei
Dong Wenlu
Yu Hualong
author_sort Pan Xiuwei
collection DOAJ
description Missing value imputation (MVI) is important for DNA microarray data analysis because missing values significantly degrade the performance of downstream analysis. Although many MVI algorithms exist for missing DNA microarray data, we note that most of them lack robustness because they adopt only a single model. In this paper, a flexible and robust MVI algorithm named EELMimpute is proposed to address the problem of imputing missing DNA microarray data. First, the algorithm constructs a relevant feature space for the target gene with the missing value; this space includes only the genes co-expressed with the target gene, as determined by their Pearson correlation coefficients. Then, several fixed-size feature subspaces are randomly extracted from the relevant feature space, and an extreme learning machine (ELM) regression model between the extracted genes and the target gene is constructed on each subspace. Finally, the models whose input genes have no missing values are selected to form the ensemble, and the missing value is imputed as the average output of all models in the ensemble. Experimental results show that EELMimpute reduces estimation error in comparison with several previous imputation algorithms.
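The steps named in the description can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`fit_elm`, `predict_elm`, `eelm_impute_sketch`), the tanh activation, the pseudoinverse solution for the output weights, every default hyperparameter, and the fallback to the gene's mean when no model qualifies are assumptions, since the record does not specify them.

```python
import numpy as np

def fit_elm(X, y, n_hidden, rng):
    """Basic ELM regressor: random fixed hidden layer, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (never trained)
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer outputs (tanh is assumed)
    beta = np.linalg.pinv(H) @ y                  # output weights via pseudoinverse
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def eelm_impute_sketch(data, gene, sample, n_relevant=10, subspace_size=4,
                       n_models=20, n_hidden=8, seed=0):
    """Estimate data[gene, sample]; data is genes x samples with np.nan for missing."""
    rng = np.random.default_rng(seed)
    target = data[gene]
    observed = ~np.isnan(target)                  # samples usable for training

    # Step 1: relevant feature space = genes most correlated with the target gene.
    scores = np.full(data.shape[0], -np.inf)
    for g in range(data.shape[0]):
        if g == gene:
            continue
        ok = observed & ~np.isnan(data[g])
        if ok.sum() >= 3 and data[g, ok].std() > 0 and target[ok].std() > 0:
            scores[g] = abs(np.corrcoef(data[g, ok], target[ok])[0, 1])
    relevant = np.argsort(scores)[::-1][:n_relevant]
    relevant = relevant[np.isfinite(scores[relevant])]
    if len(relevant) == 0:
        return float(np.nanmean(target))          # assumed fallback

    # Steps 2-3: random fixed-size subspaces, one ELM per subspace, averaged output.
    preds = []
    k = min(subspace_size, len(relevant))
    for _ in range(n_models):
        subset = rng.choice(relevant, size=k, replace=False)
        if np.isnan(data[subset, sample]).any():
            continue                              # skip models with a missing input value
        rows = observed & ~np.isnan(data[subset]).any(axis=0)
        if rows.sum() < 2:
            continue
        model = fit_elm(data[subset][:, rows].T, target[rows], n_hidden, rng)
        preds.append(predict_elm(model, data[subset, sample][None, :])[0])
    return float(np.mean(preds)) if preds else float(np.nanmean(target))
```

A call such as `eelm_impute_sketch(data, gene=0, sample=5)` returns one estimate for the missing entry. Skipping any subspace whose genes are missing at the query sample mirrors the description's rule of keeping only models without missing input gene values.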
first_indexed 2024-04-09T13:18:53Z
format Article
id doaj.art-663c2e06670d4effaa225dd3a31f131c
institution Directory Open Access Journal
issn 2261-2424
language English
last_indexed 2024-04-09T13:18:53Z
publishDate 2023-01-01
publisher EDP Sciences
record_format Article
series SHS Web of Conferences
spelling doaj.art-663c2e06670d4effaa225dd3a31f131c | 2023-05-11T09:13:44Z | eng | EDP Sciences | SHS Web of Conferences | 2261-2424 | 2023-01-01 | 166 | 01077 | 10.1051/shsconf/202316601077 | shsconf_eimm2023_01077 | A flexible ensemble regression model of extreme learning machine for missing value imputation of DNA microarray data | Pan Xiuwei (0) | Dong Wenlu (1) | Yu Hualong (2) | Beijing Huanjia Communication Technology Co., Ltd | School of Computer, Jiangsu University of Science and Technology | School of Computer, Jiangsu University of Science and Technology | https://www.shs-conferences.org/articles/shsconf/pdf/2023/15/shsconf_eimm2023_01077.pdf
spellingShingle Pan Xiuwei
Dong Wenlu
Yu Hualong
A flexible ensemble regression model of extreme learning machine for missing value imputation of DNA microarray data
SHS Web of Conferences
title A flexible ensemble regression model of extreme learning machine for missing value imputation of DNA microarray data
title_sort flexible ensemble regression model of extreme learning machine for missing value imputation of dna microarray data
url https://www.shs-conferences.org/articles/shsconf/pdf/2023/15/shsconf_eimm2023_01077.pdf