Using multiclass classification to automate the identification of patient safety incident reports by type and severity

Abstract. Background: Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately, our capacity to monitor and respond to incid...


Bibliographic Details
Main Authors: Ying Wang, Enrico Coiera, William Runciman, Farah Magrabi
Format: Article
Language: English
Published: BMC 2017-06-01
Series: BMC Medical Informatics and Decision Making
Subjects: Machine learning; Patient safety; Text mining; Incident reporting; Medical informatics
Online Access: http://link.springer.com/article/10.1186/s12911-017-0483-8
author Ying Wang
Enrico Coiera
William Runciman
Farah Magrabi
collection DOAJ
description Abstract
Background: Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately, our capacity to monitor and respond to incident reports in a timely manner is limited by the sheer volume of data collected. In this study, we aim to evaluate the feasibility of using multiclass classification to automate the identification of patient safety incidents in hospitals.
Methods: Text-based classifiers were applied to identify 10 incident types and 4 severity levels. Using the one-versus-one (OvsO) and one-versus-all (OvsA) ensemble strategies, we evaluated regularized logistic regression, linear support vector machine (SVM) and SVM with a radial-basis function (RBF) kernel. Classifiers were trained and tested with “balanced” datasets (n_Type = 2860, n_SeverityLevel = 1160) from a state-wide incident reporting system. Testing was also undertaken with imbalanced “stratified” datasets (n_Type = 6000, n_SeverityLevel = 5950) from the state-wide system and an independent hospital reporting system. Classifier performance was evaluated using a confusion matrix, as well as F-score, precision and recall.
Results: The most effective combination was an OvsO ensemble of binary SVM RBF classifiers with binary count feature extraction. For incident type, classifiers performed well on balanced and stratified datasets (F-score: 78.3% and 73.9%, respectively), but were worse on independent datasets (68.5%). Reports about falls, medications, pressure injury, aggression and blood products were identified with high recall and precision. “Documentation” was the hardest type to identify. For severity level, the F-score on balanced data was 87.3% for severity assessment code (SAC) 1 (extreme risk) and 64% for SAC4 (low risk). With stratified data, high recall was achieved for SAC1 (82.8–84%) but precision was poor (6.8–11.2%). High risk incidents (SAC2) were confused with medium risk incidents (SAC3).
Conclusions: Binary classifier ensembles appear to be a feasible method for identifying incidents by type and severity level. Automated identification should enable safety problems to be detected and addressed in a more timely manner. Multi-label classifiers may be necessary for reports that relate to more than one incident type.
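The abstract reports high recall but poor precision for SAC1 on imbalanced data. A minimal sketch of how per-class precision, recall and F-score fall out of a confusion matrix may make that pattern easier to see. The function name `per_class_metrics` and the counts below are hypothetical illustrations, not figures or code from the study; the only assumption is that a rare class (10 of 1000 reports) is flagged generously.

```python
from typing import Dict, List


def per_class_metrics(confusion: List[List[int]],
                      labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall and F-score for each class.

    confusion[i][j] is the number of reports whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                         # correctly identified
        fn = sum(confusion[k]) - tp                  # true class k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp   # predicted k, true class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision,
                          "recall": recall,
                          "f_score": f_score}
    return metrics


# Hypothetical, illustrative counts only (not data from the paper):
# 10 true SAC1 reports among 1000, with 72 other reports wrongly flagged.
labels = ["SAC1", "other"]
confusion = [
    [8, 2],     # true SAC1: 8 found, 2 missed
    [72, 918],  # true other: 72 flagged as SAC1, 918 correct
]

result = per_class_metrics(confusion, labels)
for name, scores in result.items():
    print(name, {key: round(value, 3) for key, value in scores.items()})
```

With these made-up counts, SAC1 recall is 8/10 while precision is only 8/80, because the false positives drawn from the much larger class outnumber the true positives. This is the same qualitative effect the abstract describes for stratified data.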
first_indexed 2024-12-22T04:41:04Z
format Article
id doaj.art-221df449d89246eea783f0a9c79ef13a
institution Directory Open Access Journal
issn 1472-6947
language English
last_indexed 2024-12-22T04:41:04Z
publishDate 2017-06-01
publisher BMC
record_format Article
series BMC Medical Informatics and Decision Making
spelling doaj.art-221df449d89246eea783f0a9c79ef13a 2022-12-21T18:38:45Z
BMC Medical Informatics and Decision Making (BMC, ISSN 1472-6947), 2017-06-01, vol. 17, no. 1, pp. 1-12, doi:10.1186/s12911-017-0483-8
Ying Wang (Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University)
Enrico Coiera (Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University)
William Runciman (Centre for Population Health Research, Division of Health Sciences, University of South Australia)
Farah Magrabi (Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University)
title Using multiclass classification to automate the identification of patient safety incident reports by type and severity
topic Machine learning
Patient safety
Text mining
Incident reporting
Medical informatics
url http://link.springer.com/article/10.1186/s12911-017-0483-8