Provably efficient offline reinforcement learning in regular decision processes

This paper deals with offline (or batch) Reinforcement Learning (RL) in episodic Regular Decision Processes (RDPs). RDPs are the subclass of Non-Markov Decision Processes where the dependency on the history of past events can be captured by a finite-state automaton. We consider a setting where the a...

Bibliographic Details
Main Authors: Cipollone, R; Jonsson, A; Ronca, A; Talebi, MS
Format: Conference item
Language: English
Published: Neural Information Processing Systems Foundation 2024