Summary: | Early detection of sepsis is key to ensuring timely clinical intervention. Since very few end-to-end pipelines are publicly available, fair comparisons between methodologies are difficult if
not impossible. Progress is further limited by discrepancies in the reconstruction of sepsis
onset time.
This retrospective cohort study highlights the variation in performance of predictive
models under three subtly different interpretations of sepsis onset from the sepsis-III
definition and compares this against inter-model differences. The models are chosen to
cover tree-based, deep learning, and survival analysis methods.
Using the MIMIC-III database, between 867 and 2178 intensive care unit admissions with
sepsis were identified, depending on the onset definition. We show that model performance
can be more sensitive to differences in the definition of sepsis onset than to the model itself.
Given a fixed sepsis definition, the best-performing method had a gain of 1–5% in the area
under the receiver operating characteristic curve (AUROC). However, the choice of onset time can
have a greater effect, with variation of 0–6% in AUROC.
We illustrate that misleading conclusions can be drawn if models are compared without
consideration of the sepsis definition used, which emphasizes the need for a standardized
definition of sepsis onset.
|