Provably efficient offline reinforcement learning in regular decision processes

This paper deals with offline (or batch) Reinforcement Learning (RL) in episodic Regular Decision Processes (RDPs). RDPs are the subclass of Non-Markov Decision Processes where the dependency on the history of past events can be captured by a finite-state automaton. We consider a setting where the a...

Bibliographic details
Main Authors:	Cipollone, R, Jonsson, A, Ronca, A, Talebi, MS
Format:	Conference item
Language:	English
Published:	Neural Information Processing Systems Foundation 2024
