Implementing action mask in proximal policy optimization (PPO) algorithm
The proximal policy optimization (PPO) algorithm is a promising algorithm in reinforcement learning. In this paper, we propose to add an action mask in the PPO algorithm. The mask indicates whether an action is valid or invalid for each state. Simulation results show that, when compared with the ori...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Elsevier
2020-09-01
|
Series: | ICT Express |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2405959520300746 |