Implementing action mask in proximal policy optimization (PPO) algorithm
The proximal policy optimization (PPO) algorithm is a promising algorithm in reinforcement learning. In this paper, we propose to add an action mask in the PPO algorithm. The mask indicates whether an action is valid or invalid for each state. Simulation results show that, when compared with the ori...
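
The abstract above is truncated and does not say how the mask enters the policy computation. As a minimal sketch, assuming the mask is applied by excluding invalid actions from the softmax over the policy's raw scores (one common approach, not necessarily the authors' method), the idea can be illustrated as follows. The function name `masked_policy` and the example values are hypothetical.

```python
import math

def masked_policy(logits, mask):
    """Return action probabilities with invalid actions forced to zero.

    logits: raw policy scores, one per action (stand-in for a policy-network output).
    mask:   booleans, True where the action is valid in the current state.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    # Invalid actions contribute nothing to the softmax.
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical state with four actions, of which the second and fourth are invalid.
probs = masked_policy([2.0, 1.0, 0.5, -1.0], [True, False, True, False])
```

Under this assumption, the same mask would be used both when sampling actions and when evaluating the probabilities that enter the PPO probability ratio, so that an invalid action is never sampled and never contributes to the update.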
Main Authors: | Cheng-Yen Tang, Chien-Hung Liu, Woei-Kae Chen, Shingchern D. You |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2020-09-01 |
Series: | ICT Express |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2405959520300746 |
Similar Items
- A Novel Single-Site Mutation in the Catalytic Domain of Protoporphyrinogen Oxidase IX (PPO) Confers Resistance to PPO-Inhibiting Herbicides
  by: Gulab Rangani, et al.
  Published: (2019-05-01)
- Intelligent Smart Marine Autonomous Surface Ship Decision System Based on Improved PPO Algorithm
  by: Wei Guan, et al.
  Published: (2022-07-01)
- Multiple-UAV Reinforcement Learning Algorithm Based on Improved PPO in Ray Framework
  by: Guang Zhan, et al.
  Published: (2022-07-01)
- Assessment of Efficacy and Mechanism of Resistance to Soil-Applied PPO Inhibitors in *Amaranthus palmeri*
  by: Gulab Rangani, et al.
  Published: (2023-02-01)
- Exploring the Use of Invalid Action Masking in Reinforcement Learning: A Comparative Study of On-Policy and Off-Policy Algorithms in Real-Time Strategy Games
  by: Yueqi Hou, et al.
  Published: (2023-07-01)