Provably efficient offline reinforcement learning in regular decision processes
This paper deals with offline (or batch) Reinforcement Learning (RL) in episodic Regular Decision Processes (RDPs). RDPs are the subclass of Non-Markov Decision Processes where the dependency on the history of past events can be captured by a finite-state automaton. We consider a setting where the a...
Main Authors: | , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | Neural Information Processing Systems Foundation, 2024 |