An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records.

The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured n...


Bibliographic Details
Main Authors:	S Scott Graham, Savannah Shifflet, Maaz Amjad, Kasey Claborn
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2024-01-01
Series:	PLoS ONE
Online Access:	https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0292170&type=printable

_version_	1797315955934625792
author	S Scott Graham, Savannah Shifflet, Maaz Amjad, Kasey Claborn
author_facet	S Scott Graham, Savannah Shifflet, Maaz Amjad, Kasey Claborn
author_sort	S Scott Graham
collection	DOAJ
description	The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured narrative data. Opioid overdose annotations were provided by two harm reduction paramedics and two supporting annotators trained to reliably match expert annotations. Candidate feature engineering techniques included term frequency-inverse document frequency (TF-IDF), a highly performant approach to concept vectorization, and a custom approach based on the count of empirically-identified keywords. Each feature set was trained using four model architectures: generalized linear model (GLM), Naïve Bayes, neural network, and Extreme Gradient Boost (XGBoost). Ensembles of trained models were also evaluated. The custom feature models were also assessed for variable importance to aid interpretation. Models trained using TF-IDF feature engineering ranged from AUROC = 0.59 (95% CI: 0.53-0.66) for the Naïve Bayes to AUROC = 0.76 (95% CI: 0.71-0.81) for the neural network. Models trained using concept vectorization features ranged from AUROC = 0.83 (95% CI: 0.78-0.88) for the Naïve Bayes to AUROC = 0.89 (95% CI: 0.85-0.94) for the ensemble. Models trained using custom features were the most performant, with benchmarks ranging from AUROC = 0.92 (95% CI: 0.88-0.95) with the GLM to AUROC = 0.93 (95% CI: 0.90-0.96) for the ensemble. The custom features model achieved positive predictive values (PPV) ranging from 80 to 100%, which represent substantial improvements over previously published EMS encounter opioid overdose classifiers. The application of this approach to county EMS data can productively inform local and targeted harm reduction initiatives.
first_indexed	2024-03-08T03:11:21Z
format	Article
id	doaj.art-f9c5312d072c4148b09d1149d5b2181f
institution	Directory Open Access Journal
issn	1932-6203
language	English
last_indexed	2024-03-08T03:11:21Z
publishDate	2024-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj.art-f9c5312d072c4148b09d1149d5b2181f; 2024-02-13T05:33:59Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2024-01-01; 19; 1; e0292170; 10.1371/journal.pone.0292170
title	An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records.
url	https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0292170&type=printable
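
The description in this record mentions a custom feature set based on counts of empirically identified keywords, evaluated with AUROC and positive predictive value (PPV). The following is a minimal sketch of those three ideas in plain Python, not the authors' implementation. The keyword list, the toy narratives, the summed-count score, and the threshold are all hypothetical and chosen only for illustration; the record does not list the study's keywords or describe its preprocessing.

```python
import re
from typing import Dict, List, Sequence

# Hypothetical keyword list for illustration only; the study's
# empirically identified keywords are not given in this record.
KEYWORDS: List[str] = ["narcan", "naloxone", "pinpoint", "unresponsive"]


def keyword_counts(narrative: str, keywords: Sequence[str] = KEYWORDS) -> Dict[str, int]:
    """Count whole-word, case-insensitive occurrences of each keyword."""
    tokens = re.findall(r"[a-z]+", narrative.lower())
    return {kw: tokens.count(kw) for kw in keywords}


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Positive predictive value: true positives / all predicted positives."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


if __name__ == "__main__":
    # Toy narratives and labels, invented for this example.
    narratives = [
        "Pt found unresponsive, pinpoint pupils. Narcan 2mg IN given.",
        "Pt fell from ladder, complains of ankle pain.",
        "Unresponsive pt, glucose 32, D10 given.",
        "Naloxone administered by bystander prior to arrival.",
    ]
    labels = [1, 0, 0, 1]

    # Stand-in for a trained model: score each record by its total keyword count.
    scores = [sum(keyword_counts(text).values()) for text in narratives]
    predictions = [1 if s >= 1 else 0 for s in scores]  # hypothetical threshold

    print("scores:", scores)
    print("AUROC:", auroc(labels, scores))
    print("PPV:", ppv(labels, predictions))
```

In practice the per-keyword counts would be passed as a feature vector to a trained classifier such as a GLM rather than summed; the summed score here only stands in for a model output so that the two metrics have something to evaluate.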


Similar Items