Learning State-Specific Action Masks for Reinforcement Learning

Efficient yet sufficient exploration remains a critical challenge in reinforcement learning (RL), especially for Markov Decision Processes (MDPs) with vast action spaces. Previous approaches have commonly involved projecting the original action space into a latent space or employing environmental ac...

Bibliographic Details
Main Authors: Ziyi Wang, Xinran Li, Luoyang Sun, Haifeng Zhang, Hualin Liu, Jun Wang
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Algorithms
Online Access: https://www.mdpi.com/1999-4893/17/2/60