Provably efficient offline reinforcement learning in regular decision processes

This paper deals with offline (or batch) Reinforcement Learning (RL) in episodic Regular Decision Processes (RDPs). RDPs are the subclass of Non-Markov Decision Processes where the dependency on the history of past events can be captured by a finite-state automaton. We consider a setting where the a...


Bibliographic details
Main authors: Cipollone, R, Jonsson, A, Ronca, A, Talebi, MS
Format: Conference item
Language: English
Published: Neural Information Processing Systems Foundation 2024