Transforming Clinical Data into Actionable Prognosis Models: Machine-Learning Framework and Field-Deployable App to Predict Outcome of Ebola Patients.
BACKGROUND: Assessment of the response to the 2014-15 Ebola outbreak indicates the need for innovations in data collection, sharing, and use to improve case detection and treatment. Here we introduce a Machine Learning pipeline for Ebola Virus Disease (EVD) prognosis prediction, which packages the be...
Main Authors: | Andres Colubri, Tom Silver, Terrence Fradet, Kalliroi Retzepi, Ben Fry, Pardis Sabeti |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2016-03-01 |
Series: | PLoS Neglected Tropical Diseases |
Online Access: | http://europepmc.org/articles/PMC4798608?pdf=render |
_version_ | 1818197855761858560 |
---|---|
author | Andres Colubri; Tom Silver; Terrence Fradet; Kalliroi Retzepi; Ben Fry; Pardis Sabeti |
author_sort | Andres Colubri |
collection | DOAJ |
description | BACKGROUND: Assessment of the response to the 2014-15 Ebola outbreak indicates the need for innovations in data collection, sharing, and use to improve case detection and treatment. Here we introduce a Machine Learning pipeline for Ebola Virus Disease (EVD) prognosis prediction, which packages the best models into a mobile app to be available in clinical care settings. The pipeline was trained on a public EVD clinical dataset, from 106 patients in Sierra Leone. METHODS/PRINCIPAL FINDINGS: We used a new tool for exploratory analysis, Mirador, to identify the most informative clinical factors that correlate with EVD outcome. The small sample size and high prevalence of missing records were significant challenges. We applied multiple imputation and bootstrap sampling to address missing data and quantify overfitting. We trained several predictors over all combinations of covariates, which resulted in an ensemble of predictors, with and without viral load information, with an area under the receiver operator characteristic curve of 0.8 or more, after correcting for optimistic bias. We ranked the predictors by their F1-score, and those above a set threshold were compiled into a mobile app, Ebola CARE (Computational Assignment of Risk Estimates). CONCLUSIONS/SIGNIFICANCE: This method demonstrates how to address small sample sizes and missing data, while creating predictive models that can be readily deployed to assist treatment in future outbreaks of EVD and other infectious diseases. By generating an ensemble of predictors instead of relying on a single model, we are able to handle situations where patient data is partially available. The prognosis app can be updated as new data become available, and we made all the computational protocols fully documented and open-sourced to encourage timely data sharing, independent validation, and development of better prediction models in outbreak response. |
first_indexed | 2024-12-12T01:56:37Z |
format | Article |
id | doaj.art-4c0a41f4e8194662a9e0aa83507c7dfc |
institution | Directory of Open Access Journals |
issn | 1935-2727 1935-2735 |
language | English |
last_indexed | 2024-12-12T01:56:37Z |
publishDate | 2016-03-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Neglected Tropical Diseases |
spelling | doaj.art-4c0a41f4e8194662a9e0aa83507c7dfc; 2022-12-22T00:42:21Z; eng; Public Library of Science (PLoS); PLoS Neglected Tropical Diseases; 1935-2727; 1935-2735; 2016-03-01; 10(3): e0004549; 10.1371/journal.pntd.0004549; http://europepmc.org/articles/PMC4798608?pdf=render |
title | Transforming Clinical Data into Actionable Prognosis Models: Machine-Learning Framework and Field-Deployable App to Predict Outcome of Ebola Patients. |
url | http://europepmc.org/articles/PMC4798608?pdf=render |
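
The abstract describes ranking predictors by F1-score, keeping those above a set threshold, and relying on the resulting ensemble when patient data is only partially available. The following is a minimal sketch of that idea, not the authors' implementation: the covariate names, F1 values, threshold, and toy scoring functions are all hypothetical placeholders, and the selection rule (use the highest-ranked predictor whose covariates are all present) is an assumption about how such an ensemble could be queried.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional

Patient = Dict[str, Optional[float]]  # None marks a missing measurement


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 = harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


@dataclass(frozen=True)
class Predictor:
    name: str
    covariates: FrozenSet[str]          # inputs this predictor requires
    f1: float                           # validation F1 (hypothetical here)
    predict: Callable[[Patient], float]  # returns a risk score in [0, 1]


def build_ensemble(candidates: List[Predictor], min_f1: float) -> List[Predictor]:
    """Keep predictors at or above the F1 threshold, best first."""
    kept = [p for p in candidates if p.f1 >= min_f1]
    return sorted(kept, key=lambda p: p.f1, reverse=True)


def select_predictor(ensemble: List[Predictor], patient: Patient) -> Optional[Predictor]:
    """Return the best-ranked predictor whose covariates are all available."""
    available = {name for name, value in patient.items() if value is not None}
    for predictor in ensemble:
        if predictor.covariates <= available:
            return predictor
    return None


# Hypothetical candidates: names, F1 values, and scoring rules are placeholders,
# not fitted models from the paper.
candidates = [
    Predictor("with_viral_load", frozenset({"viral_load", "age"}), 0.85,
              lambda pt: 0.9 if pt["viral_load"] > 6.0 else 0.3),
    Predictor("age_temperature", frozenset({"age", "temperature"}), 0.75,
              lambda pt: 0.7 if pt["temperature"] > 38.0 else 0.4),
    Predictor("age_only", frozenset({"age"}), 0.60,
              lambda pt: 0.5),
]
ensemble = build_ensemble(candidates, min_f1=0.7)  # drops "age_only"

no_viral_load = {"age": 40.0, "viral_load": None, "temperature": 38.5}
chosen = select_predictor(ensemble, no_viral_load)
```

With the viral load missing, the sketch falls back to the predictor that needs only age and temperature; a patient for whom no retained predictor has complete inputs yields `None`, which a caller would have to handle explicitly.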