Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts

Abstract: We validated a Deep Embedded Clustering (DEC) model and its adaptation for integrating mixed datatypes (in this study, numerical and categorical variables). DEC is a promising technique capable of managing extensive sets of variables and non-linear relationships....

Bibliographic Details
Main Authors: Jip W. T. M. de Kok, Frank van Rosmalen, Jacqueline Koeze, Frederik Keus, Sander M. J. van Kuijk, José Castela Forte, Ronny M. Schnabel, Rob G. H. Driessen, Thijs T. W. van Herpt, Jan-Willem E. M. Sels, Dennis C. J. J. Bergmans, Chris P. H. Lexis, William P. T. M. van Doorn, Steven J. R. Meex, Minnan Xu, Xavier Borrat, Rachel Cavill, Iwan C. C. van der Horst, Bas C. T. van Bussel
Format: Article
Language: English
Published: Nature Portfolio 2024-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-51699-z
author Jip W. T. M. de Kok
Frank van Rosmalen
Jacqueline Koeze
Frederik Keus
Sander M. J. van Kuijk
José Castela Forte
Ronny M. Schnabel
Rob G. H. Driessen
Thijs T. W. van Herpt
Jan-Willem E. M. Sels
Dennis C. J. J. Bergmans
Chris P. H. Lexis
William P. T. M. van Doorn
Steven J. R. Meex
Minnan Xu
Xavier Borrat
Rachel Cavill
Iwan C. C. van der Horst
Bas C. T. van Bussel
collection DOAJ
description Abstract: We validated a Deep Embedded Clustering (DEC) model and its adaptation for integrating mixed datatypes (in this study, numerical and categorical variables). DEC is a promising technique capable of managing extensive sets of variables and non-linear relationships. Nevertheless, DEC cannot adequately handle mixed datatypes. Therefore, we adapted DEC by replacing the autoencoder with an X-shaped variational autoencoder (XVAE) and optimising hyperparameters for cluster stability. We call this model “X-DEC”. We compared DEC and X-DEC by reproducing a previous study that used DEC to identify clusters in a population of intensive care patients. We assessed internal validity based on cluster stability on the development dataset. Since the generalisability of clustering models has been insufficiently validated on external populations, we assessed external validity by investigating whether the clusters generalised to an external validation dataset. We concluded that both DEC and X-DEC resulted in clinically recognisable and generalisable clusters, but X-DEC produced much more stable clusters.
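The DEC model named in the abstract clusters patients through soft assignments in an autoencoder's latent space. As a minimal sketch, assuming the standard DEC formulation (Student's t kernel, sharpened target distribution, KL divergence loss) rather than the authors' exact implementation, that objective can be written as follows. The function names, the toy two-dimensional latent data and the fixed centroids are all hypothetical; the XVAE used in X-DEC is not shown.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q[i, j] of latent point i to cluster j (Student's t kernel)."""
    # Squared Euclidean distance between every point and every centroid: shape (n, k).
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets p[i, j]: square q, divide by soft cluster frequency, renormalise."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """KL(P || Q), the clustering loss minimised during DEC fine-tuning."""
    return float(np.sum(p * np.log(p / q)))

# Toy latent embeddings (assumption): two well-separated groups in 2-D.
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
centroids = np.array([[0.0, 0.0], [3.0, 3.0]])  # hypothetical initial centroids

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
labels = q.argmax(axis=1)  # hard cluster label per point
```

In a full model the loss would be back-propagated through the encoder and the centroids; here both are held fixed so that only the assignment step is illustrated.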
first_indexed 2024-03-08T14:17:17Z
format Article
id doaj.art-867c03e4fffa4acaa2dd4b72c8bf55da
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-03-08T14:17:17Z
publishDate 2024-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-867c03e4fffa4acaa2dd4b72c8bf55da
2024-01-14T12:19:34Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2024-01-01
14
1
1
15
10.1038/s41598-024-51699-z
Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts
Jip W. T. M. de Kok, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Frank van Rosmalen, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Jacqueline Koeze, Department of Critical Care, University Medical Centre Groningen, University of Groningen
Frederik Keus, Department of Critical Care, University Medical Centre Groningen, University of Groningen
Sander M. J. van Kuijk, Department of Clinical Epidemiology and Medical Technical Assessment, Maastricht University Medical Centre+
José Castela Forte, Department of Clinical Pharmacy and Pharmacology, University Medical Center Groningen, University of Groningen
Ronny M. Schnabel, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Rob G. H. Driessen, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Thijs T. W. van Herpt, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Jan-Willem E. M. Sels, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Dennis C. J. J. Bergmans, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Chris P. H. Lexis, Department of Intensive Care Medicine, Maastricht University Medical Centre+
William P. T. M. van Doorn, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University
Steven J. R. Meex, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University
Minnan Xu, Takeda Pharmaceuticals
Xavier Borrat, Department of Biostatistics, Harvard T.H. Chan School of Public Health
Rachel Cavill, Department of Advanced Computing Sciences, Maastricht University
Iwan C. C. van der Horst, Department of Intensive Care Medicine, Maastricht University Medical Centre+
Bas C. T. van Bussel, Department of Intensive Care Medicine, Maastricht University Medical Centre+
https://doi.org/10.1038/s41598-024-51699-z
title Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts
url https://doi.org/10.1038/s41598-024-51699-z