Stan and BART for Causal Inference: Estimating Heterogeneous Treatment Effects Using the Power of Stan and the Flexibility of Machine Learning

A wide range of machine-learning-based approaches have been developed in the past decade, increasing our ability to accurately model nonlinear and nonadditive response surfaces. This has improved performance for inferential tasks such as estimating average treatment effects in situations where stand...

Bibliographic Details
Main Authors: Vincent Dorie, George Perrett, Jennifer L. Hill, Benjamin Goodrich
Format: Article
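Language: English
Published: MDPI AG 2022-12-01
Series: Entropy
Subjects: BART; Stan; causal inference; machine learning; heterogeneous treatment effects; multilevel data
Online Access: https://www.mdpi.com/1099-4300/24/12/1782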
collection DOAJ
description A wide range of machine-learning-based approaches have been developed in the past decade, increasing our ability to accurately model nonlinear and nonadditive response surfaces. This has improved performance for inferential tasks such as estimating average treatment effects in situations where standard parametric models may not fit the data well. These methods have also shown promise for the related task of identifying heterogeneous treatment effects. However, the estimation of both overall and heterogeneous treatment effects can be hampered when data are structured within groups if we fail to correctly model the dependence between observations. Most machine learning methods do not readily accommodate such structure. This paper introduces a new algorithm, stan4bart, that combines the flexibility of Bayesian Additive Regression Trees (BART) for fitting nonlinear response surfaces with the computational and statistical efficiencies of using Stan for the parametric components of the model. We demonstrate how stan4bart can be used to estimate average, subgroup, and individual-level treatment effects with stronger performance than other flexible approaches that ignore the multilevel structure of the data as well as multilevel approaches that have strict parametric forms.
first_indexed 2024-03-09T16:48:32Z
format Article
id doaj.art-1136357fb60e4960a89e7b0014b5afba
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-03-09T16:48:32Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-1136357fb60e4960a89e7b0014b5afba, 2023-11-24T14:42:52Z, eng
Publisher: MDPI AG
Journal: Entropy, ISSN 1099-4300
Published: 2022-12-01, volume 24, issue 12, article 1782
DOI: 10.3390/e24121782
Title: Stan and BART for Causal Inference: Estimating Heterogeneous Treatment Effects Using the Power of Stan and the Flexibility of Machine Learning
Authors and affiliations:
Vincent Dorie, Code for America, San Francisco, CA 94103, USA
George Perrett, Department of Applied Statistics, Social Science, and the Humanities, New York University, New York, NY 10003, USA
Jennifer L. Hill, Department of Applied Statistics, Social Science, and the Humanities, New York University, New York, NY 10003, USA
Benjamin Goodrich, Department of Political Science, Columbia University, New York, NY 10025, USA
Online Access: https://www.mdpi.com/1099-4300/24/12/1782
Keywords: BART; Stan; causal inference; machine learning; heterogeneous treatment effects; multilevel data
topic BART
Stan
causal inference
machine learning
heterogeneous treatment effects
multilevel data
url https://www.mdpi.com/1099-4300/24/12/1782