Information Bottleneck for Estimating Treatment Effects with Systematically Missing Covariates

Bibliographic Details
Main Authors: Sonali Parbhoo, Mario Wieser, Aleksander Wieczorek, Volker Roth
Format: Article
Language: English
Published: MDPI AG 2020-03-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/22/4/389
Description
Summary: Estimating the effects of an intervention from high-dimensional observational data is a challenging problem because of confounding. The task is often further complicated in healthcare applications, where a set of observations may be entirely missing for certain patients at test time, thereby prohibiting accurate inference. In this paper, we address this issue using an approach based on the information bottleneck to reason about the effects of interventions. To this end, we first train an information bottleneck to compress the covariates into a low-dimensional representation by explicitly considering the relevance of information for treatment effects. In a second step, we use the compressed covariates to transfer relevant information to cases where data are missing during testing. In doing so, we can reliably and accurately estimate treatment effects even in the absence of a full set of covariate information at test time. Our results on two causal inference benchmarks and a real application for treating sepsis show that our method achieves state-of-the-art performance without compromising interpretability.
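Note: For orientation, the classical information bottleneck seeks a compressed representation Z of covariates X that stays informative about a target Y. A generic form of that objective is sketched below. The symbols X, Z, Y, and β are assumptions introduced here for illustration and do not appear in this record; the paper's own objective, which accounts for treatment effects and missing covariates, may differ.

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
```

Here I(·;·) denotes mutual information. A larger β favors retaining information about Y, while a smaller β favors stronger compression of X.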
ISSN:1099-4300