Two seemingly paradoxical results in linear models: the variance inflation factor and the analysis of covariance
A result from a standard linear model course is that the variance of the ordinary least squares (OLS) coefficient of a variable will never decrease when including additional covariates into the regression. The variance inflation factor (VIF) measures the increase of the variance. Another result from...
Main Author: | Ding Peng |
---|---|
Format: | Article |
Language: | English |
Published: | De Gruyter, 2021-03-01 |
Series: | Journal of Causal Inference |
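Subjects: | causal inference; conditioning; design-based inference; potential outcomes; randomization; rerandomization; 62-01; 62a01; 62j10 |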
Online Access: | https://doi.org/10.1515/jci-2019-0023 |
_version_ | 1818322847218532352 |
---|---|
author | Ding Peng |
collection | DOAJ |
description | A result from a standard linear model course is that the variance of the ordinary least squares (OLS) coefficient of a variable will never decrease when including additional covariates into the regression. The variance inflation factor (VIF) measures the increase of the variance. Another result from a standard linear model or experimental design course is that including additional covariates in a linear model of the outcome on the treatment indicator will never increase the variance of the OLS coefficient of the treatment at least asymptotically. This technique is called the analysis of covariance (ANCOVA), which is often used to improve the efficiency of treatment effect estimation. So we have two paradoxical results: adding covariates never decreases the variance in the first result but never increases the variance in the second result. In fact, these two results are derived under different assumptions. More precisely, the VIF result conditions on the treatment indicators but the ANCOVA result averages over them. Comparing the estimators with and without adjusting for additional covariates in a completely randomized experiment, I show that the former has smaller variance averaging over the treatment indicators, and the latter has smaller variance at the cost of a larger bias conditioning on the treatment indicators. Therefore, there is no real paradox. |
first_indexed | 2024-12-13T11:03:18Z |
format | Article |
id | doaj.art-eac6ef5c639d4c43b8583e0b2c95f969 |
institution | Directory Open Access Journal |
issn | 2193-3677 2193-3685 |
language | English |
last_indexed | 2024-12-13T11:03:18Z |
publishDate | 2021-03-01 |
publisher | De Gruyter |
record_format | Article |
series | Journal of Causal Inference |
spelling | doaj.art-eac6ef5c639d4c43b8583e0b2c95f969; 2022-12-21T23:49:12Z; eng; De Gruyter; Journal of Causal Inference; 2193-3677; 2193-3685; 2021-03-01; volume 9, issue 1, pages 1-8; 10.1515/jci-2019-0023; Ding Peng (Department of Statistics, University of California, Berkeley, CA 94720, United States of America) |
title | Two seemingly paradoxical results in linear models: the variance inflation factor and the analysis of covariance |
topic | causal inference; conditioning; design-based inference; potential outcomes; randomization; rerandomization; 62-01; 62a01; 62j10 |
url | https://doi.org/10.1515/jci-2019-0023 |
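
The two results contrasted in the description can be illustrated with a small simulation. The following is a minimal sketch, not the article's own code: the sample size, the effect size, the covariate coefficient, and the helper names `coef_on_z` and `vif_of_z` are all assumptions chosen for illustration. It holds the covariate and the potential outcomes fixed, re-draws the treatment assignment of a completely randomized experiment many times, and records (a) the spread of the unadjusted and covariate-adjusted OLS coefficients across assignments, and (b) for each fixed assignment, the ratio of the classical fixed-design variance factors with and without the covariate, which is the VIF.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n1 = 100, 50                       # units, and number treated (assumed values)
x = rng.normal(size=n)                # covariate, held fixed across assignments
y0 = 2.0 * x + rng.normal(size=n)     # control potential outcomes, held fixed
y1 = y0 + 1.0                         # treated potential outcomes (constant effect 1)


def coef_on_z(z, y, covariate=None):
    """OLS coefficient of the treatment indicator z, with an intercept."""
    cols = [np.ones(len(z)), z]
    if covariate is not None:
        cols.append(covariate)
    design = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def vif_of_z(z, covariate):
    """Ratio of the (X'X)^{-1} entries for z with and without the covariate.

    Under a homoskedastic linear model with a fixed design, this ratio equals
    1 / (1 - R^2) from regressing z on the covariate, so it is at least 1.
    """
    short = np.column_stack([np.ones(len(z)), z])
    long = np.column_stack([np.ones(len(z)), z, covariate])
    v_short = np.linalg.inv(short.T @ short)[1, 1]
    v_long = np.linalg.inv(long.T @ long)[1, 1]
    return v_long / v_short


z_base = np.zeros(n)
z_base[:n1] = 1.0

unadjusted, adjusted, vifs = [], [], []
for _ in range(2000):
    z = rng.permutation(z_base)           # one complete randomization
    y = z * y1 + (1.0 - z) * y0           # observed outcomes under this assignment
    unadjusted.append(coef_on_z(z, y))
    adjusted.append(coef_on_z(z, y, x))
    vifs.append(vif_of_z(z, x))

var_unadjusted = np.var(unadjusted)       # spread over assignments, no covariate
var_adjusted = np.var(adjusted)           # spread over assignments, with covariate
min_vif = min(vifs)                       # smallest conditional inflation factor

print("variance over assignments, unadjusted:", var_unadjusted)
print("variance over assignments, adjusted:  ", var_adjusted)
print("smallest VIF across assignments:      ", min_vif)
```

In this setup the adjusted coefficient should vary less across assignments than the unadjusted one (the ANCOVA view, averaging over the treatment indicators), while the VIF for any single fixed assignment is never below 1 (the VIF view, conditioning on the treatment indicators). The two statements describe different variances, which is why they do not contradict each other.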