Counterfactual off-policy evaluation with Gumbel-Max structural causal models
We introduce an off-policy evaluation procedure for highlighting episodes where applying a reinforcement-learned (RL) policy is likely to have produced a substantially different outcome than under the observed policy. In particular, we introduce a class of structural causal models (SCMs) for generating co...
Main Authors: | Oberst, Michael, Sontag, David Alexander |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | English |
Published: | MLResearch Press, 2021 |
Online Access: | https://hdl.handle.net/1721.1/130437 |
Similar Items
- Counterfactual policy introspection using structural causal models
  by: Oberst, Michael Karl.
  Published: (2020)
- A counterfactual simulation model of causal judgments for physical events.
  by: Gerstenberg, Tobias, et al.
  Published: (2021)
- Melacak distribusi Gumbel
  by: Perpustakaan UGM, i-lib
  Published: (1984)
- Counterfactual: An R Package for Counterfactual Analysis
  by: Chen, Mingli, et al.
  Published: (2019)
- Analyses of prior selections for Gumbel distribution
  by: Rostami, Mohammad, et al.
  Published: (2013)