Provably efficient offline reinforcement learning in regular decision processes

This paper deals with offline (or batch) Reinforcement Learning (RL) in episodic Regular Decision Processes (RDPs). RDPs are the subclass of Non-Markov Decision Processes where the dependency on the history of past events can be captured by a finite-state automaton. We consider a setting where the a...

Bibliographic Details
Main Authors: Cipollone, R, Jonsson, A, Ronca, A, Talebi, MS
Format: Conference item
Language: English
Published: Neural Information Processing Systems Foundation 2024