Causal effect on a target population: A sensitivity analysis to handle missing covariates

Randomized controlled trials (RCTs) are often considered the gold standard for estimating causal effect, but they may lack external validity when the population eligible to the RCT is substantially different from the target population. Having at hand a sample of the target population of interest all...


Bibliographic Details
Main Authors: Colnet Bénédicte, Josse Julie, Varoquaux Gaël, Scornet Erwan
Format: Article
Language: English
Published: De Gruyter 2022-11-01
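Series: Journal of Causal Inference
Subjects: average treatment effect; distributional shift; external validity; generalizability; transportability
Online Access: https://doi.org/10.1515/jci-2021-0059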
collection DOAJ
description Randomized controlled trials (RCTs) are often considered the gold standard for estimating causal effect, but they may lack external validity when the population eligible to the RCT is substantially different from the target population. Having at hand a sample of the target population of interest allows us to generalize the causal effect. Identifying the treatment effect in the target population requires covariates to capture all treatment effect modifiers that are shifted between the two sets. Standard estimators then use either weighting (IPSW), outcome modeling (G-formula), or combine the two in doubly robust approaches (AIPSW). However, such covariates are often not available in both sets. In this article, after proving $L^1$-consistency of these three estimators, we compute the expected bias induced by a missing covariate, assuming a Gaussian distribution, a continuous outcome, and a semi-parametric model. Under this setting, we perform a sensitivity analysis for each missing covariate pattern and compute the sign of the expected bias. We also show that there is no gain in linearly imputing a partially unobserved covariate. Finally, we study the substitution of a missing covariate by a proxy. We illustrate all these results on simulations, as well as semi-synthetic benchmarks using data from the Tennessee student/teacher achievement ratio (STAR), and a real-world example from critical care medicine.
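The weighting (IPSW) and outcome-modeling (G-formula) ideas named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the data-generating process, the single binary covariate, the effect sizes, and all function names below are assumptions chosen for illustration, and the IPSW weights are simplified to an empirical frequency ratio rather than a fitted selection model.

```python
import random
from statistics import mean

def simulate(n_trial=20000, n_target=20000, seed=0):
    """Hypothetical data: one binary effect modifier x, shifted between sets."""
    rng = random.Random(seed)
    trial = []
    for _ in range(n_trial):
        x = 1 if rng.random() < 0.8 else 0       # P(x=1) = 0.8 in the trial
        a = 1 if rng.random() < 0.5 else 0       # randomized treatment
        y = x + a * (1.0 + 2.0 * x) + rng.gauss(0.0, 1.0)
        trial.append((x, a, y))
    # Target sample: covariate only, with P(x=1) = 0.3
    target_x = [1 if rng.random() < 0.3 else 0 for _ in range(n_target)]
    return trial, target_x

def naive_ate(trial):
    """Difference in means inside the trial (ignores the covariate shift)."""
    treated = [y for _, a, y in trial if a == 1]
    control = [y for _, a, y in trial if a == 0]
    return mean(treated) - mean(control)

def g_formula(trial, target_x):
    """Fit the effect per stratum in the trial, then average over the target."""
    cate = {}
    for x in (0, 1):
        treated = [y for xi, a, y in trial if xi == x and a == 1]
        control = [y for xi, a, y in trial if xi == x and a == 0]
        cate[x] = mean(treated) - mean(control)
    return mean(cate[x] for x in target_x)

def ipsw(trial, target_x, p_treat=0.5):
    """Reweight trial units by the target/trial frequency ratio of x."""
    n = len(trial)
    ratio = {}
    for x in (0, 1):
        p_target = sum(1 for v in target_x if v == x) / len(target_x)
        p_trial = sum(1 for xi, _, _ in trial if xi == x) / n
        ratio[x] = p_target / p_trial
    total = 0.0
    for x, a, y in trial:
        total += ratio[x] * (a * y / p_treat - (1 - a) * y / (1 - p_treat))
    return total / n

trial, target_x = simulate()
print("naive trial estimate:", round(naive_ate(trial), 2))
print("G-formula estimate:  ", round(g_formula(trial, target_x), 2))
print("IPSW estimate:       ", round(ipsw(trial, target_x), 2))
```

Under these assumed parameters the true effect is 1 + 2(0.8) = 2.6 in the trial population and 1 + 2(0.3) = 1.6 in the target population, so the naive estimate should land near the former and the two generalization estimates near the latter. Dropping x from either estimator reproduces the missing-covariate situation the description refers to.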
id doaj.art-0f9459d60c954d618b6f35f39ac277c8
institution Directory Open Access Journal
issn 2193-3685
spelling Colnet Bénédicte, Josse Julie, Varoquaux Gaël, Scornet Erwan. Causal effect on a target population: A sensitivity analysis to handle missing covariates. Journal of Causal Inference, vol. 10, no. 1, pp. 372-414, De Gruyter, 2022-11-01. ISSN 2193-3685. https://doi.org/10.1515/jci-2021-0059
Affiliations:
Colnet Bénédicte: Soda Project-team, Premedical Project-team, INRIA, and Centre de Mathématiques Appliquées, Institut Polytechnique de Paris, Palaiseau, France
Josse Julie: Premedical Project Team, INRIA Sophia-Antipolis, Montpellier, France
Varoquaux Gaël: Soda Project-team, INRIA Saclay, France
Scornet Erwan: Centre de Mathématiques Appliquées, UMR 7641, École Polytechnique, CNRS, Institut Polytechnique de Paris, Palaiseau, France
topic average treatment effect
distributional shift
external validity
generalizability
transportability
primary 62f12
93c41
62g35
secondary 62p10
62p25