Using multiclass classification to automate the identification of patient safety incident reports by type and severity
Abstract Background Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately our capacity to monitor and respond to incid...
Main Authors: | Ying Wang, Enrico Coiera, William Runciman, Farah Magrabi |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2017-06-01 |
Series: | BMC Medical Informatics and Decision Making |
Subjects: | Machine learning, Patient safety, Text mining, Incident reporting, Medical informatics |
Online Access: | http://link.springer.com/article/10.1186/s12911-017-0483-8 |
_version_ | 1819114171603615744 |
---|---|
author | Ying Wang, Enrico Coiera, William Runciman, Farah Magrabi |
collection | DOAJ |
description | Abstract. Background: Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately, our capacity to monitor and respond to incident reports in a timely manner is limited by the sheer volume of data collected. In this study, we aim to evaluate the feasibility of using multiclass classification to automate the identification of patient safety incidents in hospitals. Methods: Text-based classifiers were applied to identify 10 incident types and 4 severity levels. Using the one-versus-one (OvsO) and one-versus-all (OvsA) ensemble strategies, we evaluated regularized logistic regression, linear support vector machine (SVM) and SVM with a radial-basis function (RBF) kernel. Classifiers were trained and tested with “balanced” datasets (n_Type = 2860, n_SeverityLevel = 1160) from a state-wide incident reporting system. Testing was also undertaken with imbalanced “stratified” datasets (n_Type = 6000, n_SeverityLevel = 5950) from the state-wide system and an independent hospital reporting system. Classifier performance was evaluated using a confusion matrix, as well as F-score, precision and recall. Results: The most effective combination was an OvsO ensemble of binary SVM RBF classifiers with binary count feature extraction. For incident type, classifiers performed well on balanced and stratified datasets (F-score: 78.3% and 73.9%, respectively), but were worse on independent datasets (68.5%). Reports about falls, medications, pressure injury, aggression and blood products were identified with high recall and precision. “Documentation” was the hardest type to identify. For severity level, the F-score on balanced data was 87.3% for severity assessment code (SAC) 1 (extreme risk) and 64% for SAC4 (low risk). With stratified data, high recall was achieved for SAC1 (82.8–84%) but precision was poor (6.8–11.2%). High risk incidents (SAC2) were confused with medium risk incidents (SAC3). Conclusions: Binary classifier ensembles appear to be a feasible method for identifying incidents by type and severity level. Automated identification should enable safety problems to be detected and addressed in a more timely manner. Multi-label classifiers may be necessary for reports that relate to more than one incident type. |
first_indexed | 2024-12-22T04:41:04Z |
format | Article |
id | doaj.art-221df449d89246eea783f0a9c79ef13a |
institution | Directory Open Access Journal |
issn | 1472-6947 |
language | English |
last_indexed | 2024-12-22T04:41:04Z |
publishDate | 2017-06-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Informatics and Decision Making |
title | Using multiclass classification to automate the identification of patient safety incident reports by type and severity |
topic | Machine learning, Patient safety, Text mining, Incident reporting, Medical informatics |
url | http://link.springer.com/article/10.1186/s12911-017-0483-8 |
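
The abstract names three building blocks without giving implementation details: binary count feature extraction, a one-versus-one (OvsO) ensemble of binary classifiers, and evaluation by precision, recall and F-score. The following is a minimal sketch of those ideas only, not the authors' code. All function names, the vocabulary and the toy labels are hypothetical, and no SVM is trained here; the pairwise winners passed to `ovo_vote` are assumed to come from binary classifiers trained elsewhere.

```python
from collections import Counter
from itertools import combinations


def binary_count_features(text, vocabulary):
    """Return 1 for each vocabulary term present in the text, else 0.

    Presence is recorded, not frequency, so a repeated word still maps to 1.
    """
    tokens = set(text.lower().split())
    return [1 if term in tokens else 0 for term in vocabulary]


def ovo_pairs(labels):
    """List every unordered pair of classes; OvsO uses one binary model per pair."""
    return list(combinations(labels, 2))


def ovo_vote(pair_winners):
    """Pick the class that wins the most pairwise contests (ties: first seen)."""
    return Counter(pair_winners).most_common(1)[0][0]


def precision_recall_f(y_true, y_pred, label):
    """Compute precision, recall and F-score for one class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    vocab = ["fell", "bed", "dose"]  # made-up vocabulary for illustration
    print(binary_count_features("Patient fell and fell again near bed", vocab))

    # Number of pairwise classifiers for 10 incident types and 4 severity levels.
    print(len(ovo_pairs(range(10))), len(ovo_pairs(range(4))))

    # Made-up imbalanced labels: one true SAC1 report among three SAC4 reports.
    y_true = ["SAC1", "SAC4", "SAC4", "SAC4"]
    y_pred = ["SAC1", "SAC1", "SAC1", "SAC4"]
    print(precision_recall_f(y_true, y_pred, "SAC1"))
```

With 10 incident types an OvsO ensemble needs 45 pairwise classifiers, and with 4 severity levels it needs 6, whereas OvsA needs one classifier per class. In the toy severity labels the single SAC1 report is found (recall 1.0) but two SAC4 reports are also flagged, so precision falls to one third. This is the same pattern, in miniature, as the high recall and poor precision the abstract reports for SAC1 on stratified data.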