Two seemingly paradoxical results in linear models: the variance inflation factor and the analysis of covariance

Bibliographic Details
Main Author: Ding Peng
Format: Article
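Language: English
Published: De Gruyter 2021-03-01
Series: Journal of Causal Inference
Subjects: causal inference; conditioning; design-based inference; potential outcomes; randomization; rerandomization; 62-01; 62a01; 62j10
Online Access: https://doi.org/10.1515/jci-2019-0023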
collection DOAJ
description A standard result from a linear models course is that the variance of the ordinary least squares (OLS) coefficient of a variable never decreases when additional covariates are included in the regression. The variance inflation factor (VIF) measures this increase in variance. Another standard result, from a linear models or experimental design course, is that including additional covariates in a linear model of the outcome on the treatment indicator never increases the variance of the OLS coefficient of the treatment, at least asymptotically. This technique, the analysis of covariance (ANCOVA), is often used to improve the efficiency of treatment effect estimation. So we have two seemingly paradoxical results: adding covariates never decreases the variance in the first result but never increases it in the second. In fact, the two results are derived under different assumptions. More precisely, the VIF result conditions on the treatment indicators, whereas the ANCOVA result averages over them. Comparing the estimators with and without adjustment for additional covariates in a completely randomized experiment, I show that the adjusted estimator has smaller variance when averaging over the treatment indicators, and the unadjusted estimator has smaller variance, at the cost of a larger bias, when conditioning on the treatment indicators. Therefore, there is no real paradox.
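The two variance statements in the description can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the article: a single covariate, a constant additive treatment effect, fixed potential outcomes, and hypothetical values for `n`, `tau`, and `beta`. It is not the article's own code. It estimates the variance of the unadjusted and adjusted estimators over repeated complete randomizations (the ANCOVA view), and computes the VIF for each realized assignment (the conditional view).

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_treated = 100, 50
tau, beta = 1.0, 2.0  # hypothetical treatment effect and covariate slope

# Fixed covariate and fixed potential outcomes (design-based setup, assumed here).
x = rng.normal(size=n)
y0 = beta * x + rng.normal(size=n)
y1 = y0 + tau  # constant additive effect, an illustrative assumption


def estimates(z):
    """Return (unadjusted, covariate-adjusted) estimates for assignment z."""
    y = np.where(z == 1, y1, y0)
    unadj = y[z == 1].mean() - y[z == 0].mean()  # difference in means
    design = np.column_stack([np.ones(n), z, x])  # intercept, treatment, covariate
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return unadj, coef[1]


def vif(z):
    """VIF of the treatment coefficient after adding the single covariate x.

    With one added covariate, R^2 from regressing z on x equals the squared
    sample correlation, so VIF = 1 / (1 - r^2).
    """
    r = np.corrcoef(z, x)[0, 1]
    return 1.0 / (1.0 - r**2)


# Complete randomization: permute a fixed vector with n_treated ones.
base = np.r_[np.ones(n_treated), np.zeros(n - n_treated)]
draws = [rng.permutation(base) for _ in range(5000)]

est = np.array([estimates(z) for z in draws])
vifs = np.array([vif(z) for z in draws])

# Variance over randomizations (averaging over the treatment indicators).
var_unadj, var_adj = est.var(axis=0)
min_vif = vifs.min()

print(f"variance over randomizations: unadjusted={var_unadj:.4f}, adjusted={var_adj:.4f}")
print(f"smallest VIF across assignments: {min_vif:.4f}")
```

In this setup the adjusted estimator should show the smaller variance across randomizations, while the VIF is at least 1 for every individual assignment, which is the model-based variance comparison made when the treatment indicators are held fixed.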
first_indexed 2024-12-13T11:03:18Z
id doaj.art-eac6ef5c639d4c43b8583e0b2c95f969
institution Directory Open Access Journal
issn 2193-3677
2193-3685
last_indexed 2024-12-13T11:03:18Z
spelling doaj.art-eac6ef5c639d4c43b8583e0b2c95f969; 2022-12-21T23:49:12Z; eng; De Gruyter; Journal of Causal Inference, ISSN 2193-3677, 2193-3685; 2021-03-01; vol. 9, no. 1, pp. 1-8; doi 10.1515/jci-2019-0023; Ding Peng, Department of Statistics, University of California, Berkeley, CA 94720, United States of America
topic causal inference
conditioning
design-based inference
potential outcomes
randomization
rerandomization
62-01
62a01
62j10