Sampling‐Based versus Design‐Based Uncertainty in Regression Analysis

© 2020 The Econometric Society. Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What is the interpretation of the estimated parameters and the standard errors? In practice, researcher...

Bibliographic Details
Main Authors: Abadie, Alberto, Athey, Susan, Imbens, Guido W, Wooldridge, Jeffrey M
Other Authors: Massachusetts Institute of Technology. Department of Economics
Format: Article
Language: English
Published: The Econometric Society 2021
Online Access: https://hdl.handle.net/1721.1/136224
author Abadie, Alberto
Athey, Susan
Imbens, Guido W
Wooldridge, Jeffrey M
author2 Massachusetts Institute of Technology. Department of Economics
collection MIT
description © 2020 The Econometric Society. Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What is the interpretation of the estimated parameters and the standard errors? In practice, researchers typically assume that the sample is randomly drawn from a large population of interest and report standard errors that are designed to capture sampling variation. This is common even in applications where it is difficult to articulate what that population of interest is, and how it differs from the sample. In this article, we explore an alternative approach to inference, which is partly design-based. In a design-based setting, the values of some of the regressors can be manipulated, perhaps through a policy intervention. Design-based uncertainty emanates from lack of knowledge about the values that the regression outcome would have taken under alternative interventions. We derive standard errors that account for design-based uncertainty instead of, or in addition to, sampling-based uncertainty. We show that our standard errors in general are smaller than the usual infinite-population sampling-based standard errors and provide conditions under which they coincide.
format Article
id mit-1721.1/136224
institution Massachusetts Institute of Technology
language English
publishDate 2021
publisher The Econometric Society
record_format dspace
spelling mit-1721.1/136224 2023-09-14T19:46:24Z 2021-10-27T20:34:20Z 2021-10-27T20:34:20Z 2020 2021-03-25T16:41:04Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/136224 en 10.3982/ECTA12675 Econometrica Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf The Econometric Society arXiv
title Sampling‐Based versus Design‐Based Uncertainty in Regression Analysis
url https://hdl.handle.net/1721.1/136224