Summary: In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such problems are relevant to a large number of real-world systems and are also more amenable to evaluation than general-sum problems. Standardised environments such as the ALE and MuJoCo have allowed single-agent RL to move beyond toy domains, such as grid worlds. However, there is no comparable benchmark for cooperative multi-agent RL. As a result, most papers in this field use one-off toy problems, making it difficult to measure real progress. In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap. SMAC is based on the popular real-time strategy game StarCraft II and focuses on micromanagement challenges where each unit is controlled by an independent agent that must act based on local observations. We offer a diverse set of challenge maps and recommendations for best practices in benchmarking and evaluation. We also open-source a deep multi-agent RL framework including state-of-the-art algorithms. We believe that SMAC can provide a standard benchmark environment for years to come.
Videos of our best agents for several SMAC scenarios are available at: https://youtu.be/VZ7zmQ_obZ0.