Greedy Action Selection and Pessimistic Q-Value Updating in Multi-Agent Reinforcement Learning with Sparse Interaction

Multi-agent reinforcement learning (MARL) is a promising method for learning a collaborative action policy that enables each agent to accomplish its specified tasks, but it suffers from a state-action space that grows exponentially. This state-action space can be dramatically reduced by assuming s...

Bibliographic Details
Main Authors: Toshihiro Kujirai, Takayoshi Yokota
Format: Article
Language: English
Published: Taylor & Francis Group 2019-05-01
Series: SICE Journal of Control, Measurement, and System Integration
Online Access: http://dx.doi.org/10.9746/jcmsi.12.76