Regional Infoveillance of COVID-19 Case Rates: Analysis of Search-Engine Query Patterns
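Main Authors: | Cousins, Henry C, Cousins, Clara C, Harris, Alon, Pasquale, Louis R |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Biological Engineering |
Format: | Article |
Published: | JMIR Publications Inc. 2021 |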
Online Access: | https://hdl.handle.net/1721.1/128949 |
_version_ | 1811077683443924992 |
---|---|
author | Cousins, Henry C Cousins, Clara C Harris, Alon Pasquale, Louis R |
author2 | Massachusetts Institute of Technology. Department of Biological Engineering |
author_sort | Cousins, Henry C |
collection | MIT |
description | Background: Timely allocation of medical resources for coronavirus disease (COVID-19) requires early detection of regional outbreaks. Internet browsing data may predict case outbreaks in local populations that are yet to be confirmed.
Objective: We investigated whether search-engine query patterns can help to predict COVID-19 case rates at the state and metropolitan area levels in the United States.
Methods: We used regional confirmed case data from the New York Times and Google Trends results from 50 states and 166 county-based designated market areas (DMAs). We identified search terms whose activity precedes and correlates with confirmed case rates at the national level. We used univariate regression to construct a composite explanatory variable based on best-fitting search queries offset by temporal lags. We measured the raw and z-transformed Pearson correlation and root-mean-square error (RMSE) of the explanatory variable with out-of-sample case rate data at the state and DMA levels.
Results: Predictions were highly correlated with confirmed case rates at the state (mean r=0.69, 95% CI 0.51-0.81; median RMSE 1.27, IQR 1.48) and DMA levels (mean r=0.51, 95% CI 0.39-0.61; median RMSE 4.38, IQR 1.80), using search data available up to 10 days prior to confirmed case rates. They fit case-rate activity in 49 of 50 states and in 103 of 166 DMAs at a significance level of .05.
Conclusions: Identifiable patterns in search query activity may help to predict emerging regional outbreaks of COVID-19, although they remain vulnerable to stochastic changes in search intensity. |
first_indexed | 2024-09-23T10:46:54Z |
format | Article |
id | mit-1721.1/128949 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T10:46:54Z |
publishDate | 2021 |
publisher | JMIR Publications Inc. |
record_format | dspace |
spelling | mit-1721.1/128949 2022-09-30T22:59:29Z 2021-01-04T21:39:15Z 2021-01-04T21:39:15Z 2020-07 2020-07 Article http://purl.org/eprint/type/JournalArticle 1438-8871 https://hdl.handle.net/1721.1/128949 Cousins, Henry C. et al. "Regional Infoveillance of COVID-19 Case Rates: Analysis of Search-Engine Query Patterns." Journal of Medical Internet Research 22, 7 (July 2020): e19483. © 2020 The Authors http://dx.doi.org/10.2196/19483 Journal of Medical Internet Research Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/ application/pdf JMIR Publications Inc. Journal of Medical Internet Research (JMIR) |
title | Regional Infoveillance of COVID-19 Case Rates: Analysis of Search-Engine Query Patterns |
url | https://hdl.handle.net/1721.1/128949 |
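
Illustrative note on the Methods: the abstract describes offsetting search queries by temporal lags, fitting a univariate regression, and scoring predictions by Pearson correlation (raw and z-transformed) and RMSE. The sketch below shows one way those steps could look for a single query series. It is not the authors' code: the function names, the 0 to 10 day lag grid, the use of the Fisher z-transform to average correlations, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def lagged_pearson(search, cases, lag):
    """Pearson r between search activity and case rates `lag` days later."""
    search = np.asarray(search, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if lag > 0:
        search, cases = search[:-lag], cases[lag:]
    return float(np.corrcoef(search, cases)[0, 1])


def best_lag(search, cases, max_lag=10):
    """Lag (in days, 0..max_lag) at which the search series best tracks later cases."""
    return max(range(max_lag + 1), key=lambda k: lagged_pearson(search, cases, k))


def fit_univariate(x, y):
    """Ordinary least-squares line y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    diff = np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def fisher_mean_r(rs):
    """Average correlations on the Fisher z scale, then map back to r."""
    z = np.arctanh(np.clip(rs, -0.999999, 0.999999))
    return float(np.tanh(np.mean(z)))


def demo():
    """Synthetic example (assumed data): search activity leads cases by 5 days."""
    t = np.arange(95)
    base = np.exp(-((t - 45) / 10.0) ** 2)   # one smooth outbreak-like bump
    search, cases = base[5:], base[:-5]      # cases[t + 5] == search[t]
    lag = best_lag(search, cases)
    x, y = search[:-lag], cases[lag:]
    slope, intercept = fit_univariate(x, y)
    return lag, rmse(slope * x + intercept, y)
```

Calling `demo()` recovers the lag built into the synthetic series. With real data, the regression would be fit on one period or region and the correlation and RMSE computed on held-out case rates, as the abstract's "out-of-sample" wording indicates.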