Multiple imputation using the average code from autoencoders

Background: Missing information is a constant issue in the clinical setting. Missing values (MVs) are caused by faulty data acquisition or by sudden events in the patient’s health condition. Imputation replaces the missing information with the twofold purpose of ben...

Bibliographic Details
Main Authors:	Edwar Macias, Javier Serrano, Jose Lopez Vicario, Antoni Morell
Format:	Article
Language:	English
Published:	Elsevier 2022-01-01
Series:	Computer Methods and Programs in Biomedicine Update
Subjects:	Average code; Deep learning; Autoencoder; Multiple imputation
Online Access:	http://www.sciencedirect.com/science/article/pii/S2666990022000052

author	Edwar Macias; Javier Serrano; Jose Lopez Vicario; Antoni Morell
collection	DOAJ
description	Background: Missing information is a constant issue in the clinical setting. Missing values (MVs) are caused by faulty data acquisition or by sudden events in the patient’s health condition. Imputation replaces the missing information with the twofold purpose of benefiting from existing information and reducing bias in clinical settings. Mechanisms based on deep learning and multiple imputation (MI) are leading alternatives for imputing MVs because of their capacity to extract complex relationships and because MI accounts for the uncertainty of the imputed values. Objective: This study aims to improve the reconstruction of missing information through a novel imputation alternative that integrates an MI paradigm into deep learning models. Methods: The proposed method integrates the MI paradigm into the latent representations of an autoencoder, the so-called codes. The average code is then computed, yielding a better latent representation of the data. Finally, the average code is decoded to reconstruct the MVs. Results: The proposed method is tested on 6 datasets with different patterns of MVs. It is compared with solutions based on autoencoders and generative adversarial networks. For the random appearance of MVs, the proposed method outperforms its competitors in 97% of the scenarios, with a reconstruction gain that ranges from 1.04 to 1.45. For the other MV mechanisms, the proposed method improves the reconstruction in at least 69% of the experiments, with a gain of 1.13 to 1.91. Conclusion: The findings show that the reconstructive capacity of the average code outperforms its competitors in most of the scenarios and is close to the best solution in the rest. The integration of the MI paradigm into latent representations of data and the computation of average codes allow a more robust representation of the data and enable the enhancement of current state-of-the-art methods for high MV rates.
first_indexed	2024-04-11T06:13:24Z
format	Article
id	doaj.art-c5cb5a9cb97e41099fc9fbd763306bc1
institution	Directory Open Access Journal
issn	2666-9900
language	English
last_indexed	2024-04-11T06:13:24Z
publishDate	2022-01-01
publisher	Elsevier
record_format	Article
series	Computer Methods and Programs in Biomedicine Update
spelling	doaj.art-c5cb5a9cb97e41099fc9fbd763306bc1; 2022-12-22T04:41:09Z; eng; Elsevier; Computer Methods and Programs in Biomedicine Update; ISSN 2666-9900; 2022-01-01; 2; 100053; Multiple imputation using the average code from autoencoders; Edwar Macias (corresponding author), Javier Serrano, Jose Lopez Vicario, Antoni Morell; Wireless Information Networking (WIN) Group - Universitat Autònoma de Barcelona (UAB), Bellaterra 08193, Spain (all authors); http://www.sciencedirect.com/science/article/pii/S2666990022000052; Average code; Deep learning; Autoencoder; Multiple imputation
title	Multiple imputation using the average code from autoencoders
topic	Average code; Deep learning; Autoencoder; Multiple imputation
url	http://www.sciencedirect.com/science/article/pii/S2666990022000052

Similar Items