Learning and visualizing chronic latent representations using electronic health records

Abstract Background Nowadays, patients with chronic diseases such as diabetes and hypertension have reached alarming numbers worldwide. These diseases increase the risk of developing acute complications and involve a substantial economic burden and demand for health resources. The widespread adoptio...


Bibliographic Details
Main Authors: David Chushig-Muzo, Cristina Soguero-Ruiz, Pablo de Miguel Bohoyo, Inmaculada Mora-Jiménez
Format: Article
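Language: English
Published: BMC 2022-09-01
Series: BioData Mining
Subjects: Denoising Autoencoder; Chronic diseases; Diabetes; Hypertension; Clustering; Patient representation
Online Access: https://doi.org/10.1186/s13040-022-00303-z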
author David Chushig-Muzo
Cristina Soguero-Ruiz
Pablo de Miguel Bohoyo
Inmaculada Mora-Jiménez
collection DOAJ
description Abstract
Background: The number of patients with chronic diseases such as diabetes and hypertension has reached alarming levels worldwide. These diseases increase the risk of developing acute complications and involve a substantial economic burden and demand for health resources. The widespread adoption of Electronic Health Records (EHRs) is opening great opportunities for supporting decision-making. Nevertheless, data extracted from EHRs are complex (heterogeneous, high-dimensional and usually noisy), which hampers knowledge extraction with conventional approaches.
Methods: We propose the use of the Denoising Autoencoder (DAE), a Machine Learning (ML) technique that transforms high-dimensional data into latent representations (LRs), thus addressing the main challenges with clinical data. We explore how the combination of LRs with a visualization method can be used to map patient data into a two-dimensional space, gaining knowledge about the distribution of patients with different chronic conditions. Furthermore, this representation can also be used to characterize the evolution of a patient's health status, which is of paramount importance in the clinical setting.
Results: To obtain clinical LRs, we considered real-world data extracted from EHRs linked to the University Hospital of Fuenlabrada in Spain. Experimental results showed the great potential of DAEs to identify patients with clinical patterns linked to hypertension, diabetes and multimorbidity. The procedure allowed us to find patients with the same main chronic disease but different clinical characteristics. Thus, we identified two kinds of diabetic patients with differences in their drug therapy (insulin and non-insulin dependent), and also a group of women affected by hypertension and gestational diabetes. We also present a proof of concept for mapping the health status evolution of synthetic patients when considering the most significant diagnoses and drugs associated with chronic patients.
Conclusion: Our results highlighted the value of ML techniques to extract clinical knowledge, supporting the identification of patients with certain chronic conditions. Furthermore, the patient's health status progression in the two-dimensional space might be used as a tool for clinicians aiming to characterize health conditions and identify their most relevant clinical codes.
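The denoising-autoencoder idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the six binary "clinical codes", the two synthetic patient groups, the masking-noise rate, the layer sizes and every name below (`TinyDAE`, `corrupt`, `bce`) are assumptions chosen only to show the principle of reconstructing a clean input from a corrupted copy and reading the hidden layer as a latent representation. The separate visualization step that maps latent representations to two dimensions is not shown.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def corrupt(x, p, rng):
    # Masking noise: each entry is set to 0 with probability p.
    return [0.0 if rng.random() < p else v for v in x]

class TinyDAE:
    """One-hidden-layer denoising autoencoder (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        # Latent representation: one sigmoid unit per hidden dimension.
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                for row, b in zip(self.W2, self.b2)]

    def train_step(self, x_noisy, x_clean, lr):
        # Forward pass on the corrupted input; the target is the clean input.
        h = self.encode(x_noisy)
        y = self.decode(h)
        # Sigmoid output + cross-entropy: output-layer error is (y - target).
        d_out = [yi - xi for yi, xi in zip(y, x_clean)]
        # Backpropagate to the hidden layer before touching W2.
        d_hid = [sum(d_out[i] * self.W2[i][j] for i in range(len(d_out))) * h[j] * (1.0 - h[j])
                 for j in range(len(h))]
        for i, d in enumerate(d_out):
            for j in range(len(h)):
                self.W2[i][j] -= lr * d * h[j]
            self.b2[i] -= lr * d
        for j, d in enumerate(d_hid):
            for k, v in enumerate(x_noisy):
                self.W1[j][k] -= lr * d * v
            self.b1[j] -= lr * d

def bce(model, data):
    # Mean binary cross-entropy between clean inputs and their reconstructions.
    eps = 1e-12
    total = 0.0
    for x in data:
        y = model.decode(model.encode(x))
        total -= sum(xi * math.log(yi + eps) + (1.0 - xi) * math.log(1.0 - yi + eps)
                     for xi, yi in zip(x, y))
    return total / len(data)

# Hypothetical toy data: two groups of patients with disjoint code patterns.
rng = random.Random(42)
group_a = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
group_b = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
patients = [group_a] * 10 + [group_b] * 10

model = TinyDAE(n_in=6, n_hidden=2)
loss_before = bce(model, patients)
order = list(range(len(patients)))
for epoch in range(300):
    rng.shuffle(order)
    for idx in order:
        x = patients[idx]
        model.train_step(corrupt(x, 0.2, rng), x, lr=0.5)
loss_after = bce(model, patients)

latent_a = model.encode(group_a)  # latent representation of a group-A patient
latent_b = model.encode(group_b)  # latent representation of a group-B patient
```

On real EHR data the input would be much higher-dimensional and a tested deep-learning library would be the sensible choice; the pure-Python version here only keeps the example self-contained.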
first_indexed 2024-04-12T05:19:34Z
format Article
id doaj.art-c7effac20a4b40f68e5d3ea442f12930
institution Directory Open Access Journal
issn 1756-0381
language English
last_indexed 2024-04-12T05:19:34Z
publishDate 2022-09-01
publisher BMC
record_format Article
series BioData Mining
spelling doaj.art-c7effac20a4b40f68e5d3ea442f12930 2022-12-22T03:46:33Z eng BMC BioData Mining 1756-0381 2022-09-01 151127 10.1186/s13040-022-00303-z
Learning and visualizing chronic latent representations using electronic health records
David Chushig-Muzo (0), Cristina Soguero-Ruiz (1), Pablo de Miguel Bohoyo (2), Inmaculada Mora-Jiménez (3)
(0) Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University
(1) Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University
(2) University Hospital of Fuenlabrada
(3) Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University
https://doi.org/10.1186/s13040-022-00303-z
Denoising Autoencoder; Chronic diseases; Diabetes; Hypertension; Clustering; Patient representation
title Learning and visualizing chronic latent representations using electronic health records
topic Denoising Autoencoder
Chronic diseases
Diabetes
Hypertension
Clustering
Patient representation
url https://doi.org/10.1186/s13040-022-00303-z