Identification and Prediction of Clinical Phenotypes in Hospitalized Patients With COVID-19: Machine Learning From Medical Records

Background: There is significant heterogeneity in disease progression among hospitalized patients with COVID-19. The pathogenesis of SARS-CoV-2 infection is attributed to a complex interplay between virus and host immune response that in some patients unpredictably and rapidly...

Bibliographic Details
Main Authors:	Tom Velez, Tony Wang, Brian Garibaldi, Eric Singman, Ioannis Koutroulis
Format:	Article
Language:	English
Published:	JMIR Publications 2023-10-01
Series:	JMIR Formative Research
Online Access:	https://formative.jmir.org/2023/1/e46807

author	Tom Velez Tony Wang Brian Garibaldi Eric Singman Ioannis Koutroulis
author_sort	Tom Velez
collection	DOAJ
description	Background: There is significant heterogeneity in disease progression among hospitalized patients with COVID-19. The pathogenesis of SARS-CoV-2 infection is attributed to a complex interplay between virus and host immune response that in some patients unpredictably and rapidly leads to “hyperinflammation” associated with increased risk of mortality. The early identification of patients at risk of progression to hyperinflammation may help inform timely therapeutic decisions and lead to improved outcomes. Objective: The primary objective of this study was to use machine learning to reproducibly identify specific risk-stratifying clinical phenotypes across hospitalized patients with COVID-19 and compare treatment response characteristics and outcomes. A secondary objective was to derive a predictive phenotype classification model using routinely available early encounter data that may be useful in informing optimal COVID-19 bedside clinical management. Methods: This was a retrospective analysis of electronic health record data of adult patients (N=4379) who were admitted to a Johns Hopkins Health System hospital for COVID-19 treatment from 2020 to 2021. Phenotypes were identified by clustering 38 routine clinical observations recorded during inpatient care. To examine the reproducibility and validity of the derived phenotypes, patient data were randomly divided into 2 cohorts, and clustering analysis was performed independently for each cohort. A predictive phenotype classifier using the gradient-boosting machine method was derived using routine clinical observations recorded during the first 6 hours following admission. Results: A total of 2 phenotypes (designated as phenotype 1 and phenotype 2) were identified in patients admitted for COVID-19 in both the training and validation cohorts with similar distributions of features, correlations with biomarkers, treatments, comorbidities, and outcomes. In both the training and validation cohorts, phenotype-2 patients were older; had elevated markers of inflammation; and were at an increased risk of requiring intensive care unit–level care, developing sepsis, and mortality compared with phenotype-1 patients. The gradient-boosting machine phenotype prediction model yielded an area under the curve of 0.89 and a positive predictive value of 0.83. Conclusions: Using machine learning clustering, we identified and internally validated 2 clinical COVID-19 phenotypes with distinct treatment or response characteristics consistent with similar 2-phenotype models derived from other hospitalized populations with COVID-19, supporting the reliability and generalizability of these findings. COVID-19 phenotypes can be accurately identified using machine learning models based on readily available early encounter clinical data. A phenotype prediction model based on early encounter data may be clinically useful for timely bedside risk stratification and treatment personalization.
first_indexed	2024-03-11T19:34:17Z
format	Article
id	doaj.art-0b7dc5fdf3764a9e9b5e5cecd5ec904e
institution	Directory Open Access Journal
issn	2561-326X
language	English
last_indexed	2024-03-11T19:34:17Z
publishDate	2023-10-01
publisher	JMIR Publications
record_format	Article
series	JMIR Formative Research
spelling	doaj.art-0b7dc5fdf3764a9e9b5e5cecd5ec904e 2023-10-06T13:13:54Z eng JMIR Publications JMIR Formative Research 2561-326X 2023-10-01 7 e46807 10.2196/46807 Identification and Prediction of Clinical Phenotypes in Hospitalized Patients With COVID-19: Machine Learning From Medical Records Tom Velez https://orcid.org/0000-0003-2157-9165 Tony Wang https://orcid.org/0000-0002-6760-5456 Brian Garibaldi https://orcid.org/0000-0001-8632-5567 Eric Singman https://orcid.org/0000-0003-0327-4675 Ioannis Koutroulis https://orcid.org/0000-0002-8396-9022 https://formative.jmir.org/2023/1/e46807
title	Identification and Prediction of Clinical Phenotypes in Hospitalized Patients With COVID-19: Machine Learning From Medical Records
url	https://formative.jmir.org/2023/1/e46807

Similar Items