Summary: | Objective To explore the risk factors for the prognosis of patients with aneurysmal subarachnoid hemorrhage (aSAH) after clipping, and to construct a predictive model based on machine learning algorithms to guide early identification of high-risk patients. Methods A total of 182 patients with aSAH who underwent clipping at Tianjin Huanhu Hospital from October 2020 to July 2021 were reviewed. The data were randomly divided at a ratio of 7∶3 into a training set (to construct the prediction model) and a test set (to evaluate the prediction model). The synthetic minority oversampling technique (SMOTE) was used to address class imbalance. Recursive feature elimination, Spearman rank correlation analysis and XGBoost feature importance analysis were used to select the optimal variables. Logistic regression (LR), random forest (RF), support vector machine (SVM), decision tree (DT), K-nearest neighbor (KNN) and naive Bayes (NB) algorithms were used to construct prediction models. Receiver operating characteristic (ROC) curves were plotted and the area under the curve (AUC) was calculated, as well as the accuracy, precision, recall and F1 value. Results The 7∶3 split yielded a training set of 127 cases, including 103 cases with good prognosis [Glasgow Outcome Scale (GOS) grade 4-5] and 24 cases with poor prognosis (GOS grade 1-3). SMOTE generated 79 synthetic cases of poor prognosis, balancing the training set at 103 cases of good prognosis and 103 cases of poor prognosis. The test set consisted of 55 cases, including 44 cases with good prognosis and 11 cases with poor prognosis. 
A total of 17 optimal features were obtained by feature selection and feature importance analysis. The number of aneurysms, alkaline phosphatase, creatinine, and the use of lysine, heparin sodium and nitroglycerin tended to be positively correlated with good prognosis, whereas age, Hunt-Hess score, mature neutrophil count, serum sodium, uric acid, total bilirubin, basophil count, creatine kinase, the use of furosemide and human albumin, and length of hospital stay tended to be negatively correlated with good prognosis. The LR model had an AUC of 0.75±0.08 (95%CI: 0.615-0.857, P=0.001), an accuracy of 0.764, a precision of 0.919, a recall of 0.773, and an F1 value of 0.840; the RF model had an AUC of 0.57±0.08 (95%CI: 0.428-0.701, P=0.283), an accuracy of 0.745, a precision of 0.826, a recall of 0.864, and an F1 value of 0.845; the SVM model had an AUC of 0.65±0.08 (95%CI: 0.507-0.772, P=0.034), an accuracy of 0.764, a precision of 0.860, a recall of 0.841, and an F1 value of 0.850; the DT model had an AUC of 0.61±0.09 (95%CI: 0.473-0.742, P=0.135), an accuracy of 0.709, a precision of 0.850, a recall of 0.773, and an F1 value of 0.810; the KNN model had an AUC of 0.66±0.08 (95%CI: 0.519-0.782, P=0.060), an accuracy of 0.618, a precision of 0.897, a recall of 0.591, and an F1 value of 0.712; and the NB model had an AUC of 0.56±0.08 (95%CI: 0.417-0.691, P=0.458), an accuracy of 0.673, a precision of 0.825, a recall of 0.750, and an F1 value of 0.786. The LR model had the best prediction performance (P<0.05, for all). Conclusions Machine learning algorithms performed well in predicting the prognosis of patients with aSAH after clipping; among them, the LR model had the best prediction performance and could be used for preoperative prediction to help neurosurgeons make better clinical decisions.
|
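The sample sizes and test-set metrics reported in the summary can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code. The function names are hypothetical, and the confusion-matrix counts (TP=34, FP=3, FN=10, TN=8) are not reported in the summary: they are one set of counts, inferred here under the assumption that good prognosis is treated as the positive class, that is consistent with the 44 good-prognosis and 11 poor-prognosis test cases.

```python
def split_sizes(n_total, train_fraction):
    """Return (train, test) sizes, rounding the training size down."""
    n_train = int(n_total * train_fraction)
    return n_train, n_total - n_train


def synthetic_cases_needed(n_majority, n_minority):
    """Number of minority-class cases SMOTE must add to balance two classes."""
    return n_majority - n_minority


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# 7:3 split of the 182 reviewed patients.
n_train, n_test = split_sizes(182, 0.7)

# Training set: 103 good-prognosis vs 24 poor-prognosis cases.
n_synthetic = synthetic_cases_needed(103, 24)

# ASSUMPTION: inferred counts for the LR model on the 55 test cases,
# with good prognosis as the positive class (not stated in the summary).
lr_metrics = classification_metrics(tp=34, fp=3, fn=10, tn=8)

print(n_train, n_test, n_synthetic)
print({name: round(value, 3) for name, value in lr_metrics.items()})
```

Under these assumptions the split gives 127 and 55 cases, SMOTE needs 79 synthetic cases, and the inferred counts round to the reported LR accuracy of 0.764, precision of 0.919, recall of 0.773 and F1 value of 0.840. With only 11 poor-prognosis cases in the test set, a change of one or two classifications moves each metric noticeably, which is worth keeping in mind when comparing the six models.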