Scaling Up Q-Learning via Exploiting State–Action Equivalence

Recent success stories in reinforcement learning have demonstrated that leveraging structural properties of the underlying environment is key in devising viable methods capable of solving complex tasks. We study off-policy learning in discounted reinforcement learning, where some equivalence relatio...

Bibliographic Details
Main Authors: Yunlian Lyu, Aymeric Côme, Yijie Zhang, Mohammad Sadegh Talebi
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/25/4/584