Counterfactual off-policy evaluation with Gumbel-max structural causal models
We introduce an off-policy evaluation procedure for highlighting episodes where applying a reinforcement learned (RL) policy is likely to have produced a substantially different outcome than the observed policy. In particular, we introduce a class of structural causal models (SCMs) for generating co...
Main Authors: | Oberst, Michael; Sontag, David Alexander |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | English |
Published: | MLResearch Press, 2021 |
Online Access: | https://hdl.handle.net/1721.1/130437 |
Similar Items
- Counterfactual policy introspection using structural causal models
  by: Oberst, Michael Karl.
  Published: (2020)
- A counterfactual simulation model of causal judgments for physical events.
  by: Gerstenberg, Tobias, et al.
  Published: (2021)
- Melacak distribusi Gumbel
  by: Perpustakaan UGM, i-lib
  Published: (1984)
- Causal counterfactual for the attribution of weather and climate-related events
  by: Hannart, A, et al.
  Published: (2016)
- Causality, black holes, prediction, and counterfactuals in general relativity
  by: Lesourd, M
  Published: (2019)