A Generalizable, Data-Driven Approach to Predict Daily Risk of Clostridium difficile Infection at Two Large Academic Health Centers


Bibliographic Details
Main Authors: Oh, Jeeheh; Makar, Maggie; Fusco, Christopher; McCaffrey, Robert; Rao, Krishna; Ryan, Erin E; Washer, Laraine; West, Lauren R; Young, Vincent B; Guttag, John; Hooper, David C; Shenoy, Erica S; Wiens, Jenna
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: English
Published: Cambridge University Press (CUP) 2021
Online Access: https://hdl.handle.net/1721.1/133418
collection MIT
description © 2018 by The Society for Healthcare Epidemiology of America. All rights reserved.
OBJECTIVE: An estimated 293,300 healthcare-associated cases of Clostridium difficile infection (CDI) occur annually in the United States. To date, research has focused on developing risk prediction models for CDI that work well across institutions. However, this one-size-fits-all approach ignores important hospital-specific factors. We focus on a generalizable method for building facility-specific models. We demonstrate the applicability of the approach using electronic health records (EHR) from the University of Michigan Hospitals (UM) and the Massachusetts General Hospital (MGH).
METHODS: We utilized EHR data from 191,014 adult admissions to UM and 65,718 adult admissions to MGH. We extracted patient demographics, admission details, patient history, and daily hospitalization details, resulting in 4,836 features from patients at UM and 1,837 from patients at MGH. We used L2 regularized logistic regression to learn the models, and we measured the discriminative performance of the models on held-out data from each hospital.
RESULTS: Using the UM and MGH test data, the models achieved area under the receiver operating characteristic curve (AUROC) values of 0.82 (95% confidence interval [CI], 0.80-0.84) and 0.75 (95% CI, 0.73-0.78), respectively. Some predictive factors were shared between the 2 models, but many of the top predictive factors differed between facilities.
CONCLUSION: A data-driven approach to building models for estimating daily patient risk for CDI was used to build institution-specific models at 2 large hospitals with different patient populations and EHR systems. In contrast to traditional approaches that focus on developing models that apply across hospitals, our generalizable approach yields risk-stratification models tailored to an institution. These hospital-specific models allow for earlier and more accurate identification of high-risk patients and better targeting of infection prevention strategies.
Infect Control Hosp Epidemiol 2018;39:425-433
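The abstract describes L2 regularized logistic regression evaluated by AUROC on held-out data. The following is a minimal sketch of that general technique in pure Python on synthetic data. It is not the authors' pipeline: the data generator, the function names (`fit_l2_logreg`, `predict_risk`, `auroc`, `make_synthetic`), the batch gradient-descent optimizer, and the values of `lam`, `lr`, and `epochs` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=300):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The bias term is left unpenalized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The lam * wj term is the gradient of the L2 penalty.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, X):
    # Predicted probability of the positive class for each row.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as 0.5. Quadratic in the sample size, fine for a sketch.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, rng):
    # Hypothetical data: two informative features and one noise feature.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        p = sigmoid(1.5 * x[0] - 1.0 * x[1] - 1.0)
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(200, rng)  # held-out split
w, b = fit_l2_logreg(X_train, y_train)
test_auroc = auroc(y_test, predict_risk(w, b, X_test))
print(f"held-out AUROC: {test_auroc:.2f}")
```

A library implementation would likely replace the hand-written optimizer in practice; this record does not say which software the authors used.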
id mit-1721.1/133418
institution Massachusetts Institute of Technology
spelling mit-1721.1/133418 2023-03-02T16:02:35Z 2021-10-27T19:52:46Z 2021-10-27T19:52:46Z 2018 2019-05-30T14:29:09Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/133418 en 10.1017/ICE.2018.16 Infection Control and Hospital Epidemiology Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Cambridge University Press (CUP) PMC