A study on predicting the length of hospital stay for Chinese patients with ischemic stroke based on the XGBoost algorithm

Abstract Background The incidence of stroke is a challenge in China, as stroke imposes a heavy burden on families, national health services, social services, and the economy. The length of hospital stay (LOS) is an essential indicator of utilization of medical services and is usually used to assess...


Bibliographic Details
Main Authors: Rui Chen, Shengfa Zhang, Jie Li, Dongwei Guo, Weijun Zhang, Xiaoying Wang, Donghua Tian, Zhiyong Qu, Xiaohua Wang
Format: Article
Language: English
Published: BMC 2023-03-01
Series: BMC Medical Informatics and Decision Making
Subjects: Ischemic stroke; XGBoost algorithm; Length of hospital stay (LOS); Machine learning (ML) model
Online Access: https://doi.org/10.1186/s12911-023-02140-4
author Rui Chen
Shengfa Zhang
Jie Li
Dongwei Guo
Weijun Zhang
Xiaoying Wang
Donghua Tian
Zhiyong Qu
Xiaohua Wang
collection DOAJ
description Abstract. Background: The incidence of stroke poses a challenge in China, as stroke imposes a heavy burden on families, national health services, social services, and the economy. The length of hospital stay (LOS) is an essential indicator of medical service utilization and is commonly used to assess the efficiency of hospital management and the quality of patient care. This study established a machine learning model to predict the LOS of ischemic stroke patients. Methods: Electronic medical records of 18,195 ischemic stroke patients, comprising 28 attributes, were extracted from a large comprehensive hospital in China. LOS prediction was treated as a multiclass classification problem, with LOS divided into three categories: 1–7 days, 8–14 days, and more than 14 days. After data preprocessing and feature selection, the XGBoost algorithm was used to build the model, which was validated with 10-fold cross-validation. Accuracy (ACC), recall (RE), and the F1 measure were used to evaluate predictive performance. Finally, the XGBoost algorithm was used to rank all attributes by feature importance so that irrelevant features could be identified and removed. Results: Compared with the naive Bayes, logistic regression, decision tree classifier, and AdaBoost classifier algorithms, the XGBoost algorithm achieved higher ACC, RE, and F1 measure. The average ACC, RE, and F1 measure were 0.89, 0.89, and 0.89 under 10-fold cross-validation. The feature importance analysis indicated that the LOS of ischemic stroke patients was affected by demographic characteristics, past medical history, admission examination features, and operation characteristics. The features hemiplegia aphasia, MRS, NIHSS, TIA, operation or not, coma index, etc. were the most important for predicting the LOS of ischemic stroke patients. Conclusions: The XGBoost algorithm was an appropriate machine learning method for predicting the LOS of patients with ischemic stroke. Based on the prediction model, an intelligent medical management system could be developed to predict LOS from ischemic stroke patients’ electronic medical records.
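The labelling and evaluation steps described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`los_class`, `k_fold_indices`, `evaluate`) are hypothetical, the folds are simple contiguous splits, and recall and F1 are macro-averaged because the abstract does not say which averaging was used. Model training itself (XGBoost in the study) is omitted so the sketch stays dependency-free.

```python
def los_class(days):
    """Map a length of stay in days to the three categories in the abstract.

    0 -> 1-7 days, 1 -> 8-14 days, 2 -> more than 14 days.
    """
    if days < 1:
        raise ValueError("length of stay must be at least 1 day")
    if days <= 7:
        return 0
    if days <= 14:
        return 1
    return 2


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds.

    Assumption: no shuffling or stratification; a real pipeline would
    normally shuffle the records first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def evaluate(y_true, y_pred, n_classes=3):
    """Return (accuracy, macro recall, macro F1) for integer class labels."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recalls, f1s = [], []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return accuracy, sum(recalls) / n_classes, sum(f1s) / n_classes


# Made-up stays (in days) and made-up predictions, for illustration only.
stays = [3, 7, 8, 14, 15, 30]
y_true = [los_class(d) for d in stays]
y_pred = [0, 0, 1, 2, 2, 2]
acc, rec, f1 = evaluate(y_true, y_pred)
print(f"ACC={acc:.2f} RE={rec:.2f} F1={f1:.2f}")
```

In a full pipeline, each `(train_idx, test_idx)` pair from `k_fold_indices` would be used to fit a classifier on the training rows and score it on the held-out rows with `evaluate`, and the ten scores would then be averaged.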
first_indexed 2024-04-09T21:37:38Z
format Article
id doaj.art-2626f61a05b14d82ad702f33b3ffabba
institution Directory of Open Access Journals
issn 1472-6947
language English
last_indexed 2024-04-09T21:37:38Z
publishDate 2023-03-01
publisher BMC
record_format Article
series BMC Medical Informatics and Decision Making
spelling doaj.art-2626f61a05b14d82ad702f33b3ffabba 2023-03-26T11:12:33Z eng BMC BMC Medical Informatics and Decision Making 1472-6947 2023-03-01 23 1 1 10 10.1186/s12911-023-02140-4
A study on predicting the length of hospital stay for Chinese patients with ischemic stroke based on the XGBoost algorithm
Rui Chen: Refined Management Office, Cangzhou Central Hospital
Shengfa Zhang: National Population Health Data Center, Chinese Academy of Medical Sciences and Peking Union Medical College
Jie Li: School of Economics and Management, Hebei University of Technology
Dongwei Guo: School of Economics and Management, Hebei University of Technology
Weijun Zhang: School of Social Development and Public Policy, Beijing Normal University
Xiaoying Wang: School of Social Development and Public Policy, Beijing Normal University
Donghua Tian: School of Social Development and Public Policy, Beijing Normal University
Zhiyong Qu: School of Social Development and Public Policy, Beijing Normal University
Xiaohua Wang: School of Social Development and Public Policy, Beijing Normal University
https://doi.org/10.1186/s12911-023-02140-4
Ischemic stroke
XGBoost algorithm
Length of hospital stay (LOS)
Machine learning (ML) model
title A study on predicting the length of hospital stay for Chinese patients with ischemic stroke based on the XGBoost algorithm
topic Ischemic stroke
XGBoost algorithm
Length of hospital stay (LOS)
Machine learning (ML) model
url https://doi.org/10.1186/s12911-023-02140-4