Decision analysis framework for predicting no-shows to appointments using machine learning algorithms

Abstract. Background: No-show to medical appointments has significant adverse effects on healthcare systems and their clients. Using machine learning to predict no-shows allows managers to implement strategies such as overbooking and reminders targeting patients most likely to miss appointments, optim...

Bibliographic Details
Main Authors:	Carolina Deina, Flavio S. Fogliatto, Giovani J. C. da Silveira, Michel J. Anzanello
Format:	Article
Language:	English
Published:	BMC 2024-01-01
Series:	BMC Health Services Research
Subjects:	Missed appointments; Healthcare environments; Imbalanced dataset; Classification algorithms; Resampling techniques; Machine learning
Online Access:	https://doi.org/10.1186/s12913-023-10418-6

_version_	1827388646269911040
author	Carolina Deina, Flavio S. Fogliatto, Giovani J. C. da Silveira, Michel J. Anzanello
author_sort	Carolina Deina
collection	DOAJ
description	Abstract. Background: No-show to medical appointments has significant adverse effects on healthcare systems and their clients. Using machine learning to predict no-shows allows managers to implement strategies such as overbooking and reminders targeting patients most likely to miss appointments, optimizing the use of resources. Methods: In this study, we proposed a detailed analytical framework for predicting no-shows while addressing imbalanced datasets. The framework includes a novel use of z-fold cross-validation performed twice during the modeling process to improve model robustness and generalization. We also introduce Symbolic Regression (SR) as a classification algorithm and Instance Hardness Threshold (IHT) as a resampling technique and compared their performance with that of other classification algorithms, such as K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), and resampling techniques, such as Random under Sampling (RUS), Synthetic Minority Oversampling Technique (SMOTE) and NearMiss-1. We validated the framework using two attendance datasets from Brazilian hospitals with no-show rates of 6.65% and 19.03%. Results: From the academic perspective, our study is the first to propose using SR and IHT to predict the no-show of patients. Our findings indicate that SR and IHT presented superior performances compared to other techniques, particularly IHT, which excelled when combined with all classification algorithms and led to low variability in performance metrics results. Our results also outperformed sensitivity outcomes reported in the literature, with values above 0.94 for both datasets. Conclusion: This is the first study to use SR and IHT methods to predict patient no-shows and the first to propose performing z-fold cross-validation twice. Our study highlights the importance of avoiding relying on few validation runs for imbalanced datasets as it may lead to biased results and inadequate analysis of the generalization and stability of the models obtained during the training stage.
first_indexed	2024-03-08T16:22:51Z
format	Article
id	doaj.art-eda8c2b9fd644d93be1e047e2e149d42
institution	Directory Open Access Journal
issn	1472-6963
language	English
last_indexed	2024-03-08T16:22:51Z
publishDate	2024-01-01
publisher	BMC
record_format	Article
series	BMC Health Services Research
spelling	doaj.art-eda8c2b9fd644d93be1e047e2e149d42; 2024-01-07T12:17:46Z; eng; BMC; BMC Health Services Research; 1472-6963; 2024-01-01; 241117; 10.1186/s12913-023-10418-6; Decision analysis framework for predicting no-shows to appointments using machine learning algorithms; Carolina Deina (Department of Industrial Engineering, Federal University of Rio Grande do Sul); Flavio S. Fogliatto (Department of Industrial Engineering, Federal University of Rio Grande do Sul); Giovani J. C. da Silveira (Haskayne School of Business, University of Calgary); Michel J. Anzanello (Department of Industrial Engineering, Federal University of Rio Grande do Sul); https://doi.org/10.1186/s12913-023-10418-6; Missed appointments; Healthcare environments; Imbalanced dataset; Classification algorithms; Resampling techniques; Machine learning
title	Decision analysis framework for predicting no-shows to appointments using machine learning algorithms
topic	Missed appointments; Healthcare environments; Imbalanced dataset; Classification algorithms; Resampling techniques; Machine learning
url	https://doi.org/10.1186/s12913-023-10418-6
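
The abstract describes evaluating classifiers on imbalanced attendance data with resampling and repeated cross-validation. The sketch below is only an illustration of that general idea and is not the authors' framework: the record does not say how the two cross-validation rounds are arranged, so the sketch simply repeats stratified folds, applies Random Under-Sampling (one of the comparison techniques named in the abstract, not IHT) to the training portion only, and reports sensitivity per fold. All function names, the single numeric feature, the toy threshold classifier, and the synthetic data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed):
    """Split row indices into k folds that preserve the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def random_under_sample(indices, labels, seed):
    """Drop majority-class rows at random until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return minority + rng.sample(majority, len(minority))


def sensitivity(y_true, y_pred):
    """True-positive rate: share of actual no-shows (label 1) that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def fit_threshold(xs, ys):
    """Toy stand-in classifier (assumption): cut halfway between class means."""
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    return (mean0 + mean1) / 2


def repeated_cv(xs, ys, k=5, repeats=2):
    """Run k-fold CV `repeats` times; resample the training folds only."""
    scores = []
    for rep in range(repeats):
        folds = stratified_folds(ys, k, seed=rep)
        for held_out in range(k):
            train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            train = random_under_sample(train, ys, seed=rep * k + held_out)
            cut = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
            test = folds[held_out]  # left at its original class ratio
            preds = [1 if xs[i] >= cut else 0 for i in test]
            scores.append(sensitivity([ys[i] for i in test], preds))
    return scores


# Synthetic data (assumption): 10% no-shows, one hypothetical numeric feature.
data_rng = random.Random(0)
ys = [1] * 20 + [0] * 180
xs = [data_rng.gauss(30, 5) if y == 1 else data_rng.gauss(10, 5) for y in ys]
scores = repeated_cv(xs, ys)
print(f"{len(scores)} fold scores, mean sensitivity {sum(scores) / len(scores):.2f}")
```

Resampling inside the loop, after the held-out fold has been set aside, keeps the test fold at the original no-show rate, so the reported sensitivity is not inflated by balanced test data. Collecting one score per fold across repeats also gives a spread of results, in line with the abstract's caution against relying on few validation runs.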
