An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records.
The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured n...
Main Authors: | S Scott Graham, Savannah Shifflet, Maaz Amjad, Kasey Claborn |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2024-01-01 |
Series: | PLoS ONE |
Online Access: | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0292170&type=printable |
_version_ | 1797315955934625792 |
---|---|
author | S Scott Graham Savannah Shifflet Maaz Amjad Kasey Claborn |
author_sort | S Scott Graham |
collection | DOAJ |
description | The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured narrative data. Opioid overdose annotations were provided by two harm reduction paramedics and two supporting annotators trained to reliably match expert annotations. Candidate feature engineering techniques included term frequency-inverse document frequency (TF-IDF), a highly performant approach to concept vectorization, and a custom approach based on the count of empirically identified keywords. Each feature set was used to train four model architectures: generalized linear model (GLM), Naïve Bayes, neural network, and Extreme Gradient Boost (XGBoost). Ensembles of trained models were also evaluated. The custom feature models were also assessed for variable importance to aid interpretation. Models trained using TF-IDF feature engineering ranged from AUROC = 0.59 (95% CI: 0.53-0.66) for the Naïve Bayes to AUROC = 0.76 (95% CI: 0.71-0.81) for the neural network. Models trained using concept vectorization features ranged from AUROC = 0.83 (95% CI: 0.78-0.88) for the Naïve Bayes to AUROC = 0.89 (95% CI: 0.85-0.94) for the ensemble. Models trained using custom features were the most performant, with benchmarks ranging from AUROC = 0.92 (95% CI: 0.88-0.95) for the GLM to AUROC = 0.93 (95% CI: 0.90-0.96) for the ensemble. The custom features model achieved positive predictive values (PPV) ranging from 80 to 100%, which represent substantial improvements over previously published EMS encounter opioid overdose classifiers. The application of this approach to county EMS data can productively inform local and targeted harm reduction initiatives. |
first_indexed | 2024-03-08T03:11:21Z |
format | Article |
id | doaj.art-f9c5312d072c4148b09d1149d5b2181f |
institution | Directory of Open Access Journals |
issn | 1932-6203 |
language | English |
last_indexed | 2024-03-08T03:11:21Z |
publishDate | 2024-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
spelling | doaj.art-f9c5312d072c4148b09d1149d5b2181f; 2024-02-13T05:33:59Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2024-01-01; 19; 1; e0292170; 10.1371/journal.pone.0292170; An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records.; S Scott Graham; Savannah Shifflet; Maaz Amjad; Kasey Claborn; https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0292170&type=printable |
title | An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records. |
title_sort | interpretable machine learning framework for opioid overdose surveillance from emergency medical services records |
url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0292170&type=printable |
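
The custom feature approach named in the description (counts of empirically identified keywords) and the reported PPV metric can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the keyword list, the toy narratives, and the "any keyword hit" decision rule are hypothetical and are not the study's keywords, data, or fitted models.

```python
import re
from collections import Counter

# Hypothetical keyword list for illustration only; the study's empirically
# identified keywords are not given in this record.
KEYWORDS = ["narcan", "naloxone", "pinpoint", "overdose"]


def keyword_counts(narrative, keywords=KEYWORDS):
    """Return one count per keyword for a free-text narrative."""
    tokens = Counter(re.findall(r"[a-z]+", narrative.lower()))
    return [tokens[k] for k in keywords]


def positive_predictive_value(y_true, y_pred):
    """PPV = TP / (TP + FP); returns None when nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else None


# Toy, made-up narratives with made-up labels (not study data).
records = [
    ("Pt unresponsive, pinpoint pupils, narcan given, suspected overdose", 1),
    ("Fall from ladder, ankle pain, no loss of consciousness", 0),
    ("Naloxone administered by bystander prior to arrival", 1),
    ("Chest pain, history of overdose two years ago", 0),
]

features = [keyword_counts(text) for text, _ in records]

# Stand-in decision rule (an assumption, not a fitted GLM): flag a record
# as a suspected opioid overdose if any keyword appears at least once.
y_pred = [1 if sum(f) >= 1 else 0 for f in features]
y_true = [label for _, label in records]

ppv = positive_predictive_value(y_true, y_pred)
print(features)
print(ppv)
```

In practice the count vectors would feed a trained classifier such as a GLM rather than a fixed rule; the fourth toy record shows why, since a bare keyword match on a past overdose produces a false positive and lowers PPV.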