Targeted-BEHRT: deep learning for observational causal inference on longitudinal electronic health records

Observational causal inference is useful for decision-making in medicine when randomized clinical trials (RCTs) are infeasible or nongeneralizable. However, traditional approaches do not always deliver unconfounded causal conclusions in practice. The rise of “doubly robust” nonparametric tools coupl...

Bibliographic Details
Main Authors: Rao, S, Mamouei, M, Salimi-Khorshidi, G, Li, Y, Ramakrishnan, R, Canoy, D, Hassaine, A, Rahimi, K
Format: Journal article
Language: English
Published: IEEE 2022
_version_ 1811139481601835008
collection OXFORD
description Observational causal inference is useful for decision-making in medicine when randomized clinical trials (RCTs) are infeasible or nongeneralizable. However, traditional approaches do not always deliver unconfounded causal conclusions in practice. The rise of “doubly robust” nonparametric tools, coupled with the growth of deep learning for capturing rich representations of multimodal data, offers a unique opportunity to develop and test such models for causal inference on comprehensive electronic health records (EHRs). In this article, we investigate causal modeling of an RCT-established causal association: the effect of classes of antihypertensives on incident cancer risk. We develop a transformer-based model, the targeted bidirectional EHR transformer (T-BEHRT), coupled with doubly robust estimation to estimate the average risk ratio (RR). We compare our model to benchmark statistical and deep learning models for causal inference in multiple experiments on semi-synthetic derivations of our dataset with various types and intensities of confounding. To further test the reliability of our approach, we also evaluate our model in limited-data settings. We find that our model provides more accurate estimates of relative risk, with the least sum absolute error (SAE) from ground truth, compared with benchmark estimations. Finally, our model provides an estimate of the class-wise antihypertensive effect on cancer risk that is consistent with results derived from RCTs.
first_indexed 2024-03-07T07:16:34Z
id oxford-uuid:42dca2d1-4e89-46de-83af-c30ccf68a63f
institution University of Oxford
last_indexed 2024-09-25T04:06:47Z
record_format dspace
spelling oxford-uuid:42dca2d1-4e89-46de-83af-c30ccf68a63f; 2024-06-05T11:43:29Z; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:42dca2d1-4e89-46de-83af-c30ccf68a63f; English; Symplectic Elements; IEEE; 2022
title Targeted-BEHRT: deep learning for observational causal inference on longitudinal electronic health records