Using multiclass classification to automate the identification of patient safety incident reports by type and severity

Abstract. Background: Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately, our capacity to monitor and respond to incid...

Bibliographic Details
Main Authors:	Ying Wang, Enrico Coiera, William Runciman, Farah Magrabi
Format:	Article
Language:	English
Published:	BMC 2017-06-01
Series:	BMC Medical Informatics and Decision Making
Subjects:	Machine learning; Patient safety; Text mining; Incident reporting; Medical informatics
Online Access:	http://link.springer.com/article/10.1186/s12911-017-0483-8

author	Ying Wang, Enrico Coiera, William Runciman, Farah Magrabi
author_facet	Ying Wang, Enrico Coiera, William Runciman, Farah Magrabi
author_sort	Ying Wang
collection	DOAJ
description	Abstract
Background: Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately, our capacity to monitor and respond to incident reports in a timely manner is limited by the sheer volume of data collected. In this study, we aim to evaluate the feasibility of using multiclass classification to automate the identification of patient safety incidents in hospitals.
Methods: Text-based classifiers were applied to identify 10 incident types and 4 severity levels. Using the one-versus-one (OvsO) and one-versus-all (OvsA) ensemble strategies, we evaluated regularized logistic regression, linear support vector machine (SVM) and SVM with a radial-basis function (RBF) kernel. Classifiers were trained and tested with “balanced” datasets (n_Type = 2860, n_SeverityLevel = 1160) from a state-wide incident reporting system. Testing was also undertaken with imbalanced “stratified” datasets (n_Type = 6000, n_SeverityLevel = 5950) from the state-wide system and an independent hospital reporting system. Classifier performance was evaluated using a confusion matrix, as well as F-score, precision and recall.
Results: The most effective combination was an OvsO ensemble of binary SVM RBF classifiers with binary count feature extraction. For incident type, classifiers performed well on balanced and stratified datasets (F-score: 78.3% and 73.9%, respectively), but performed worse on independent datasets (68.5%). Reports about falls, medications, pressure injury, aggression and blood products were identified with high recall and precision. “Documentation” was the hardest type to identify. For severity level, on balanced data the F-score was 87.3% for severity assessment code (SAC) 1 (extreme risk) and 64% for SAC4 (low risk). With stratified data, high recall was achieved for SAC1 (82.8–84%) but precision was poor (6.8–11.2%). High-risk incidents (SAC2) were confused with medium-risk incidents (SAC3).
Conclusions: Binary classifier ensembles appear to be a feasible method for identifying incidents by type and severity level. Automated identification should enable safety problems to be detected and addressed in a more timely manner. Multi-label classifiers may be necessary for reports that relate to more than one incident type.
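The pipeline summarized in the abstract (binary count features, a one-versus-one ensemble of binary classifiers, and evaluation by precision, recall and F-score) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the OverlapLearner class is a hypothetical stand-in for the binary SVM RBF classifiers, the example reports and labels are invented for illustration, and all function and class names are assumptions. In practice a library such as scikit-learn, which provides SVC and OneVsOneClassifier, would likely be used instead.

```python
from collections import Counter
from itertools import combinations


def binary_features(text):
    """Binary count features: which tokens occur, ignoring how often."""
    return frozenset(text.lower().split())


class OverlapLearner:
    """Hypothetical stand-in for a binary classifier (NOT an SVM).

    Scores a report by summing, over its tokens, the fraction of training
    reports in each class that contain that token.
    """

    def fit(self, docs, labels):
        self.freq = {}
        for c in sorted(set(labels)):
            members = [d for d, l in zip(docs, labels) if l == c]
            counts = Counter(tok for d in members for tok in d)
            self.freq[c] = {t: n / len(members) for t, n in counts.items()}
        return self

    def predict(self, doc):
        scores = {c: sum(f.get(t, 0.0) for t in doc) for c, f in self.freq.items()}
        # Ties go to the alphabetically first class.
        return max(sorted(scores), key=scores.get)


class OneVsOne:
    """One-versus-one ensemble: one binary learner per pair of classes."""

    def __init__(self, make_learner):
        self.make_learner = make_learner

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.models = []
        for a, b in combinations(self.classes, 2):
            pairs = [(d, l) for d, l in zip(docs, labels) if l in (a, b)]
            sub_docs, sub_labels = zip(*pairs)
            self.models.append(self.make_learner().fit(sub_docs, sub_labels))
        return self

    def predict(self, doc):
        # Each pairwise model casts one vote; the most-voted class wins.
        votes = Counter(m.predict(doc) for m in self.models)
        return max(self.classes, key=lambda c: votes[c])


def per_class_scores(y_true, y_pred):
    """Per-class (precision, recall, F-score) derived from confusion counts."""
    confusion = Counter(zip(y_true, y_pred))
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = confusion[(c, c)]
        fp = sum(n for (t, p), n in confusion.items() if p == c and t != c)
        fn = sum(n for (t, p), n in confusion.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f_score = 2 * precision * recall / total if total else 0.0
        scores[c] = (precision, recall, f_score)
    return scores


# Invented toy reports; real incident reports are longer and noisier.
train = [
    ("patient slipped and fell near the bed", "falls"),
    ("found on floor after fall in bathroom", "falls"),
    ("wrong dose of medication given", "medication"),
    ("medication administered to wrong patient", "medication"),
    ("pressure injury noted on sacrum", "pressure injury"),
    ("stage two pressure area on heel", "pressure injury"),
]
docs = [binary_features(text) for text, _ in train]
labels = [label for _, label in train]
model = OneVsOne(OverlapLearner).fit(docs, labels)

test_reports = ["patient fell in the bathroom", "wrong medication dose",
                "pressure injury on heel"]
predictions = [model.predict(binary_features(t)) for t in test_reports]
print(predictions)

# Hypothetical imbalanced case: one rare report among ten, flagged four times.
rare_true = ["rare"] + ["common"] * 9
rare_pred = ["rare"] * 4 + ["common"] * 6
print(per_class_scores(rare_true, rare_pred)["rare"])
```

The last lines show, with invented labels, how a rare class can reach full recall while precision stays low once false positives from the much larger classes accumulate. This is the same kind of pattern the abstract reports for SAC1 on the stratified data.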
first_indexed	2024-12-22T04:41:04Z
format	Article
id	doaj.art-221df449d89246eea783f0a9c79ef13a
institution	Directory of Open Access Journals
issn	1472-6947
language	English
last_indexed	2024-12-22T04:41:04Z
publishDate	2017-06-01
publisher	BMC
record_format	Article
series	BMC Medical Informatics and Decision Making
spelling	doaj.art-221df449d89246eea783f0a9c79ef13a; 2022-12-21T18:38:45Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2017-06-01; 171112; 10.1186/s12911-017-0483-8; Using multiclass classification to automate the identification of patient safety incident reports by type and severity; Ying Wang (0), Enrico Coiera (1), William Runciman (2), Farah Magrabi (3); (0) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; (1) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; (2) Centre for Population Health Research, Division of Health Sciences, University of South Australia; (3) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; http://link.springer.com/article/10.1186/s12911-017-0483-8; Machine learning; Patient safety; Text mining; Incident reporting; Medical informatics
title	Using multiclass classification to automate the identification of patient safety incident reports by type and severity
topic	Machine learning; Patient safety; Text mining; Incident reporting; Medical informatics
url	http://link.springer.com/article/10.1186/s12911-017-0483-8

Similar Items