COVID-19 data are messy: analytic methods for rigorous impact analyses with imperfect data

Bibliographic Details
Main Authors: Michael A. Stoto, Abbey Woolverton, John Kraemer, Pepita Barlow, Michael Clarke
Format: Article
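Language: English
Published: BMC 2022-01-01
Series: Globalization and Health
Subjects: COVID-19; Non-pharmaceutical interventions; Surveillance data; Surveillance biases; Impact analysis; Observational studies
Online Access: https://doi.org/10.1186/s12992-021-00795-0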
collection DOAJ
description Abstract
Background: The COVID-19 pandemic has led to an avalanche of scientific studies, drawing on many different types of data. However, studies addressing the effectiveness of government actions against COVID-19, especially non-pharmaceutical interventions, often exhibit data problems that threaten the validity of their results. This review is thus intended to help epidemiologists and other researchers identify a set of data issues that, in our view, must be addressed in order for their work to be credible. We further intend to help journal editors and peer reviewers when evaluating studies, to apprise policy-makers, journalists, and other research consumers about the strengths and weaknesses of published studies, and to inform the wider debate about the scientific quality of COVID-19 research.
Results: To this end, we describe common challenges in the collection, reporting, and use of epidemiologic, policy, and other data, including completeness and representativeness of outcomes data; their comparability over time and among jurisdictions; the adequacy of policy variables and data on intermediate outcomes such as mobility and mask use; and a mismatch between level of intervention and outcome variables. We urge researchers to think critically about potential problems with the COVID-19 data sources over the specific time periods and particular locations they have chosen to analyze, and not only to choose appropriate study designs but also to conduct appropriate checks and sensitivity analyses to investigate the impact of potential threats on study findings.
Conclusions: In an effort to encourage high-quality research, we provide recommendations on how to address the issues we identify. Our first recommendation is for researchers to choose an appropriate design (and the data it requires). This review describes considerations and issues in order to identify the strongest analytical designs and demonstrates how interrupted time-series and comparative longitudinal studies can be particularly useful. Furthermore, we recommend that researchers conduct checks or sensitivity analyses of how robust the results are to data source and design choices, which we illustrate. Regardless of the approaches taken, researchers should be explicit about the kind of data problems or other biases that the design choice and sensitivity analyses are addressing.
id doaj.art-b9aecb5b3964499c9adb3e80df7c7774
institution Directory Open Access Journal
issn 1744-8603
Author affiliations:
Michael A. Stoto: Georgetown University and Harvard T.H. Chan School of Public Health
Abbey Woolverton: Georgetown University
John Kraemer: Georgetown University
Pepita Barlow: London School of Economics and Political Science
Michael Clarke: Western University
title COVID-19 data are messy: analytic methods for rigorous impact analyses with imperfect data
topic COVID-19
Non-pharmaceutical interventions
Surveillance data
Surveillance biases
Impact analysis
Observational studies