Conditionally elicitable dynamic risk measures for deep reinforcement learning

We propose a novel framework to solve risk-sensitive reinforcement learning problems where the agent optimizes time-consistent dynamic spectral risk measures. Based on the notion of conditional elicitability, our methodology constructs (strictly consistent) scoring functions that are used as penaliz...

Bibliographic Details
Main Authors: Coache, A, Jaimungal, S, Cartea, Á
Format: Journal article
Language: English
Published: Society for Industrial and Applied Mathematics 2023