Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts
Abstract We validated a Deep Embedded Clustering (DEC) model and its adaptation for integrating mixed datatypes (in this study, numerical and categorical variables). Deep Embedded Clustering (DEC) is a promising technique capable of managing extensive sets of variables and non-linear relationships....
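Main Authors: | Jip W. T. M. de Kok, Frank van Rosmalen, Jacqueline Koeze, Frederik Keus, Sander M. J. van Kuijk, José Castela Forte, Ronny M. Schnabel, Rob G. H. Driessen, Thijs T. W. van Herpt, Jan-Willem E. M. Sels, Dennis C. J. J. Bergmans, Chris P. H. Lexis, William P. T. M. van Doorn, Steven J. R. Meex, Minnan Xu, Xavier Borrat, Rachel Cavill, Iwan C. C. van der Horst, Bas C. T. van Bussel |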
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2024-01-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-024-51699-z |
_version_ | 1797355864494964736 |
---|---|
author | Jip W. T. M. de Kok Frank van Rosmalen Jacqueline Koeze Frederik Keus Sander M. J. van Kuijk José Castela Forte Ronny M. Schnabel Rob G. H. Driessen Thijs T. W. van Herpt Jan-Willem E. M. Sels Dennis C. J. J. Bergmans Chris P. H. Lexis William P. T. M. van Doorn Steven J. R. Meex Minnan Xu Xavier Borrat Rachel Cavill Iwan C. C. van der Horst Bas C. T. van Bussel |
author_sort | Jip W. T. M. de Kok |
collection | DOAJ |
description | Abstract We validated a Deep Embedded Clustering (DEC) model and its adaptation for integrating mixed datatypes (in this study, numerical and categorical variables). Deep Embedded Clustering (DEC) is a promising technique capable of managing extensive sets of variables and non-linear relationships. Nevertheless, DEC cannot adequately handle mixed datatypes. Therefore, we adapted DEC by replacing the autoencoder with an X-shaped variational autoencoder (XVAE) and optimising hyperparameters for cluster stability. We call this model “X-DEC”. We compared DEC and X-DEC by reproducing a previous study that used DEC to identify clusters in a population of intensive care patients. We assessed internal validity based on cluster stability on the development dataset. Since generalisability of clustering models has insufficiently been validated on external populations, we assessed external validity by investigating cluster generalisability onto an external validation dataset. We concluded that both DEC and X-DEC resulted in clinically recognisable and generalisable clusters, but X-DEC produced much more stable clusters. |
first_indexed | 2024-03-08T14:17:17Z |
format | Article |
id | doaj.art-867c03e4fffa4acaa2dd4b72c8bf55da |
institution | Directory of Open Access Journals |
issn | 2045-2322 |
language | English |
last_indexed | 2024-03-08T14:17:17Z |
publishDate | 2024-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-867c03e4fffa4acaa2dd4b72c8bf55da; 2024-01-14T12:19:34Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-01-01; 141115; 10.1038/s41598-024-51699-z; Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts; Jip W. T. M. de Kok (0), Frank van Rosmalen (1), Jacqueline Koeze (2), Frederik Keus (3), Sander M. J. van Kuijk (4), José Castela Forte (5), Ronny M. Schnabel (6), Rob G. H. Driessen (7), Thijs T. W. van Herpt (8), Jan-Willem E. M. Sels (9), Dennis C. J. J. Bergmans (10), Chris P. H. Lexis (11), William P. T. M. van Doorn (12), Steven J. R. Meex (13), Minnan Xu (14), Xavier Borrat (15), Rachel Cavill (16), Iwan C. C. van der Horst (17), Bas C. T. van Bussel (18); Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Critical Care, University Medical Centre Groningen, University of Groningen; Department of Critical Care, University Medical Centre Groningen, University of Groningen; Department of Clinical Epidemiology and Medical Technical Assessment, Maastricht University Medical Centre+; Department of Clinical Pharmacy and Pharmacology, University Medical Center Groningen, University of Groningen; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Cardiovascular Research Institute Maastricht (CARIM), Maastricht University; Cardiovascular Research Institute Maastricht (CARIM), Maastricht University; Takeda Pharmaceuticals; Department of Biostatistics, Harvard T.H. Chan School of Public Health; Department of Advanced Computing Sciences, Maastricht University; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Department of Intensive Care Medicine, Maastricht University Medical Centre+; Abstract We validated a Deep Embedded Clustering (DEC) model and its adaptation for integrating mixed datatypes (in this study, numerical and categorical variables). Deep Embedded Clustering (DEC) is a promising technique capable of managing extensive sets of variables and non-linear relationships. Nevertheless, DEC cannot adequately handle mixed datatypes. Therefore, we adapted DEC by replacing the autoencoder with an X-shaped variational autoencoder (XVAE) and optimising hyperparameters for cluster stability. We call this model “X-DEC”. We compared DEC and X-DEC by reproducing a previous study that used DEC to identify clusters in a population of intensive care patients. We assessed internal validity based on cluster stability on the development dataset. Since generalisability of clustering models has insufficiently been validated on external populations, we assessed external validity by investigating cluster generalisability onto an external validation dataset. We concluded that both DEC and X-DEC resulted in clinically recognisable and generalisable clusters, but X-DEC produced much more stable clusters.; https://doi.org/10.1038/s41598-024-51699-z |
spellingShingle | Jip W. T. M. de Kok Frank van Rosmalen Jacqueline Koeze Frederik Keus Sander M. J. van Kuijk José Castela Forte Ronny M. Schnabel Rob G. H. Driessen Thijs T. W. van Herpt Jan-Willem E. M. Sels Dennis C. J. J. Bergmans Chris P. H. Lexis William P. T. M. van Doorn Steven J. R. Meex Minnan Xu Xavier Borrat Rachel Cavill Iwan C. C. van der Horst Bas C. T. van Bussel Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts Scientific Reports |
title | Deep embedded clustering generalisability and adaptation for integrating mixed datatypes: two critical care cohorts |
title_sort | deep embedded clustering generalisability and adaptation for integrating mixed datatypes two critical care cohorts |
url | https://doi.org/10.1038/s41598-024-51699-z |