Modeling COVID-19 incidence with Google Trends

Infodemiologic methods could be used to enhance modeling infectious diseases. It is of interest to verify the utility of these methods using a Nigerian case study. We used Google Trends data to track COVID-19 incidences and assessed whether they could complement traditional data based solely on repo...


Bibliographic Details
Main Authors: Lateef Babatunde Amusa, Hossana Twinomurinzi, Chinedu Wilfred Okonkwo
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-09-01
Series: Frontiers in Research Metrics and Analytics
Subjects: Big Data, Google Trends, ARIMA, COVID-19, infectious disease modeling
Online Access: https://www.frontiersin.org/articles/10.3389/frma.2022.1003972/full
author Lateef Babatunde Amusa
Hossana Twinomurinzi
Chinedu Wilfred Okonkwo
author_facet Lateef Babatunde Amusa
Hossana Twinomurinzi
Chinedu Wilfred Okonkwo
author_sort Lateef Babatunde Amusa
collection DOAJ
description Infodemiologic methods could enhance the modeling of infectious diseases. It is of interest to verify the utility of these methods using a Nigerian case study. We used Google Trends (GT) data to track COVID-19 incidence and assessed whether they could complement traditional data based solely on reported case numbers. Data on Nigerian weekly COVID-19 cases from March 1, 2020, to May 31, 2021, were matched with internet search data from Google Trends. The reported weekly incidence numbers and the GT data were split into training and testing sets. ARIMA models were fitted to the reported weekly COVID-19 cases using the training set. Several COVID-related search terms were assessed theoretically and empirically for initial screening. The selected GT variable was added to the ARIMA model as a regressor. Model forecasts, both with and without GT data, were compared with weekly cases in the test set over 13 weeks. Forecast accuracies were compared visually and using root mean squared error (RMSE) and mean absolute error (MAE). Statistical significance of the difference in predictions was determined with the two-sided Diebold-Mariano test. Preliminary contemporaneous correlations between COVID-related search terms and weekly COVID-19 cases showed “loss of smell,” “loss of taste,” and “fever” (in order of magnitude) to be significantly associated with the official cases. Predictions of the ARIMA model using solely reported case numbers resulted in an RMSE of 411.4 and an MAE of 354.9. The GT-expanded model achieved better forecasting accuracy (RMSE = 388.7, MAE = 340.1). The corrected Akaike Information Criterion also favored the GT-expanded model (869.4 vs. 872.2). The difference in predictive performance was significant under a two-sided Diebold-Mariano test (DM = 6.75, p < 0.001) over the 13 weeks. Google Trends data enhanced the predictive ability of a traditionally based model and should be considered a suitable method to enhance infectious disease modeling.
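The forecast comparison described in the abstract (RMSE, MAE, and a two-sided Diebold-Mariano test) can be illustrated with a minimal sketch. This is not the authors' code: the record does not say how the metrics or the test were implemented. The function names are hypothetical, the toy numbers are invented purely for illustration and are not the study's data, and the test is simplified to squared-error loss with a one-step-ahead variance estimate and a normal approximation. Multi-step forecasts would normally call for an autocorrelation-robust variance instead.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def diebold_mariano(actual, pred_a, pred_b):
    """Simplified Diebold-Mariano test under squared-error loss.

    Assumes one-step-ahead forecasts, so only the lag-0 variance of the
    loss differential is used. Returns (statistic, two-sided p-value),
    with the p-value taken from the standard normal distribution.
    A positive statistic means pred_a has the larger average loss.
    """
    # Loss differential: squared error of model A minus that of model B.
    d = [(a - x) ** 2 - (a - y) ** 2 for a, x, y in zip(actual, pred_a, pred_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / n
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided normal p-value: 2 * (1 - Phi(|stat|)) = erfc(|stat| / sqrt(2)).
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Toy weekly counts, made up for illustration only (not the paper's data).
actual = [100, 120, 130, 150]
baseline = [90, 135, 120, 170]   # hypothetical forecasts without the GT regressor
expanded = [95, 125, 128, 160]   # hypothetical forecasts with the GT regressor

print("baseline RMSE/MAE:", rmse(actual, baseline), mae(actual, baseline))
print("expanded RMSE/MAE:", rmse(actual, expanded), mae(actual, expanded))
print("DM statistic, p-value:", diebold_mariano(actual, baseline, expanded))
```

Lower RMSE and MAE for the expanded forecasts, together with a positive statistic and a small p-value, would correspond to the pattern the abstract reports. With only four toy points the p-value itself carries no real meaning.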
first_indexed 2024-04-11T20:35:47Z
format Article
id doaj.art-10d9a50041174a499a7cc23fcd8a1c9a
institution Directory Open Access Journal
issn 2504-0537
language English
last_indexed 2024-04-11T20:35:47Z
publishDate 2022-09-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Research Metrics and Analytics
spelling doaj.art-10d9a50041174a499a7cc23fcd8a1c9a | 2022-12-22T04:04:22Z | eng | Frontiers Media S.A. | Frontiers in Research Metrics and Analytics | 2504-0537 | 2022-09-01 | 7 | 10.3389/frma.2022.1003972 | 1003972
title Modeling COVID-19 incidence with Google Trends
topic Big Data
Google Trends
ARIMA
COVID-19
infectious disease modeling
url https://www.frontiersin.org/articles/10.3389/frma.2022.1003972/full