Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh.

Accurate predictive time series modelling is important in public health planning and response during the emergence of a novel pandemic. Therefore, the aims of the study are three-fold: (a) to model the overall trend of COVID-19 confirmed cases and deaths in Bangladesh; (b) to generate a short-term f...

Bibliographic Details
Main Authors: Md Siddikur Rahman, Arman Hossain Chowdhury, Miftahuzzannat Amrin
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-01-01
Series: PLOS Global Public Health
Online Access: https://doi.org/10.1371/journal.pgph.0000495
author Md Siddikur Rahman
Arman Hossain Chowdhury
Miftahuzzannat Amrin
collection DOAJ
description Accurate predictive time series modelling is important in public health planning and response during the emergence of a novel pandemic. The aims of the study are therefore three-fold: (a) to model the overall trend of COVID-19 confirmed cases and deaths in Bangladesh; (b) to generate a short-term, 8-week forecast of COVID-19 cases and deaths; and (c) to compare the predictive accuracy of the Autoregressive Integrated Moving Average (ARIMA) and eXtreme Gradient Boosting (XGBoost) models in capturing the non-linear features and seasonal trends of the time series. The data were collected from the onset of the epidemic in Bangladesh from the Directorate General of Health Services (DGHS) and the Institute of Epidemiology, Disease Control and Research (IEDCR). The daily confirmed cases and deaths of COVID-19 over 633 days in Bangladesh were divided into several training and test sets. The ARIMA and XGBoost models were fitted on the training data, the test sets were used to evaluate each model's forecasting ability, and the predictive performances were then averaged across the test sets to choose the best model. The predictive accuracy of the models was assessed using the mean absolute error (MAE), mean percentage error (MPE), root mean square error (RMSE) and mean absolute percentage error (MAPE). The findings reveal a non-linear trend and weekly seasonality in the dataset. The average error measures of the ARIMA model for both COVID-19 confirmed cases and deaths were lower than those of the XGBoost model. Hence, in our study, the ARIMA model performed better than the XGBoost model in predicting COVID-19 confirmed cases and deaths in Bangladesh. The suggested prediction model might play a critical role in estimating the spread of a novel pandemic in Bangladesh and similar countries.
first_indexed 2024-03-12T03:31:34Z
format Article
id doaj.art-2f01304e6aef4694bd0f61989bfcfbc1
institution Directory Open Access Journal
issn 2767-3375
language English
last_indexed 2024-03-12T03:31:34Z
publishDate 2022-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLOS Global Public Health
spelling doaj.art-2f01304e6aef4694bd0f61989bfcfbc1 2023-09-03T13:26:20Z eng Public Library of Science (PLoS) PLOS Global Public Health 2767-3375 2022-01-01 2 5 e0000495 10.1371/journal.pgph.0000495 https://doi.org/10.1371/journal.pgph.0000495
title Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh.
url https://doi.org/10.1371/journal.pgph.0000495