An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records.

The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured n...

Bibliographic Details
Main Authors: S Scott Graham, Savannah Shifflet, Maaz Amjad, Kasey Claborn
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0292170&type=printable
collection DOAJ
description The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured narrative data. Opioid overdose annotations were provided by two harm reduction paramedics and two supporting annotators trained to reliably match expert annotations. Candidate feature engineering techniques included term frequency-inverse document frequency (TF-IDF), a highly performant approach to concept vectorization, and a custom approach based on the count of empirically identified keywords. Each feature set was used to train four model architectures: generalized linear model (GLM), Naïve Bayes, neural network, and Extreme Gradient Boost (XGBoost). Ensembles of trained models were also evaluated. The custom feature models were also assessed for variable importance to aid interpretation. Models trained using TF-IDF feature engineering ranged from AUROC = 0.59 (95% CI: 0.53-0.66) for the Naïve Bayes to AUROC = 0.76 (95% CI: 0.71-0.81) for the neural network. Models trained using concept vectorization features ranged from AUROC = 0.83 (95% CI: 0.78-0.88) for the Naïve Bayes to AUROC = 0.89 (95% CI: 0.85-0.94) for the ensemble. Models trained using custom features were the most performant, with benchmarks ranging from AUROC = 0.92 (95% CI: 0.88-0.95) for the GLM to AUROC = 0.93 (95% CI: 0.90-0.96) for the ensemble. The custom features model achieved positive predictive values (PPV) ranging from 80 to 100%, which represent substantial improvements over previously published EMS encounter opioid overdose classifiers. The application of this approach to county EMS data can productively inform local and targeted harm reduction initiatives.
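As a rough illustration of the keyword-count feature idea and the two metrics named in the description (AUROC and PPV), the following is a minimal sketch only, not the authors' implementation. The keyword list, the sample narrative, and the function names are hypothetical; the study's empirically identified keywords and trained models are not given in this record.

```python
import re

# Hypothetical keyword list for illustration only; the study's
# empirically identified keywords are not listed in this record.
KEYWORDS = ["narcan", "naloxone", "pinpoint", "opioid", "heroin"]


def keyword_counts(narrative, keywords=KEYWORDS):
    """Return one occurrence count per keyword for a free-text narrative."""
    tokens = re.findall(r"[a-z]+", narrative.lower())
    return [tokens.count(k) for k in keywords]


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive record scores higher; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv(labels, predictions):
    """Positive predictive value: true positives / all predicted positives."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else float("nan")


if __name__ == "__main__":
    # Made-up narrative, not taken from any EMS record.
    note = "Pt found unresponsive, pinpoint pupils. Narcan 2mg given, narcan repeated."
    print(keyword_counts(note))
```

In this sketch the count vector would feed a classifier such as a GLM; the scoring functions only show how the reported metrics are defined.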
first_indexed 2024-03-08T03:11:21Z
id doaj.art-f9c5312d072c4148b09d1149d5b2181f
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-03-08T03:11:21Z
spelling doaj.art-f9c5312d072c4148b09d1149d5b2181f; 2024-02-13T05:33:59Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2024-01-01; volume 19, issue 1, e0292170; doi:10.1371/journal.pone.0292170