The StarCraft Multi-Agent Exploration Challenges: Learning Multi-Stage Tasks and Environmental Factors Without Precise Reward Functions

In this paper, we propose a novel benchmark called the StarCraft Multi-Agent Exploration Challenges (SMAC-Exp), in which agents learn to perform multi-stage tasks and to exploit environmental factors without precise reward functions. The previous challenge (SMAC), recognized as a standard benchmark for Multi-Agent Reinforcement Learning (MARL), is mainly concerned with ensuring that all agents cooperatively eliminate approaching adversaries through fine-grained manipulation alone, guided by obvious reward functions. SMAC-Exp, on the other hand, targets the exploration capability of MARL algorithms: agents must efficiently learn implicit multi-stage tasks and environmental factors in addition to micro-control. The benchmark covers both offensive and defensive scenarios. In the offensive scenarios, agents must first locate opponents and then eliminate them. The defensive scenarios require agents to exploit topographic features; for example, agents need to position themselves behind protective structures so that enemies have difficulty attacking them. We investigate a total of twelve MARL algorithms under both the sequential and the parallel episode settings of SMAC-Exp and observe that recent approaches perform well in settings similar to the previous challenge, but that current multi-agent approaches place relatively little emphasis on exploration. To a limited extent, an enhanced exploration method has a positive effect on SMAC-Exp; however, a gap remains, and state-of-the-art algorithms cannot solve the most challenging scenarios of SMAC-Exp. Consequently, we propose a new axis for future research in Multi-Agent Reinforcement Learning.
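
For a concrete sense of how such a benchmark is typically driven, the sketch below shows a minimal random-agent interaction loop written against the original SMAC Python API (smac.env.StarCraft2Env). It assumes, purely for illustration, that SMAC-Exp scenarios load through the same interface; the scenario name used here is a hypothetical placeholder, not one defined in the paper.

# Minimal random-agent loop in the style of the original SMAC API.
# Assumption: SMAC-Exp exposes the same StarCraft2Env-style interface;
# "offense_near" is a hypothetical scenario name used only for illustration.
import numpy as np
from smac.env import StarCraft2Env

def run_random_episodes(map_name="offense_near", n_episodes=3):
    env = StarCraft2Env(map_name=map_name)  # requires a local StarCraft II install
    n_agents = env.get_env_info()["n_agents"]

    for ep in range(n_episodes):
        env.reset()
        terminated = False
        episode_reward = 0.0
        while not terminated:
            actions = []
            for agent_id in range(n_agents):
                # Each agent may only choose among its currently available actions.
                avail = env.get_avail_agent_actions(agent_id)
                actions.append(np.random.choice(np.nonzero(avail)[0]))
            reward, terminated, _info = env.step(actions)
            episode_reward += reward
        print(f"episode {ep}: total reward = {episode_reward}")
    env.close()

if __name__ == "__main__":
    run_random_episodes()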

Bibliographic Details
Main Authors: Mingyu Kim, Jihwan Oh, Yongsik Lee, Joonkee Kim, Seonghwan Kim, Song Chong, Seyoung Yun
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Multi-agent reinforcement learning; exploration; benchmark; StarCraft multi-agent challenge; multi-stage task
Online Access: https://ieeexplore.ieee.org/document/10099458/
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3266652
Citation: IEEE Access, vol. 11, pp. 37854-37868, 2023
Record: doaj.art-30b48c09878440118716ad7bf491d459 (Directory of Open Access Journals)

Author Affiliations:
Mingyu Kim (ORCID: 0000-0001-5082-7223), Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Seoul, South Korea
Jihwan Oh, Department of Economics and Law, Korea Military Academy, Seoul, South Korea
Yongsik Lee, Kim Jaechul Graduate School of AI, KAIST, Seoul, South Korea
Joonkee Kim (ORCID: 0000-0002-2155-6556), Kim Jaechul Graduate School of AI, KAIST, Seoul, South Korea
Seonghwan Kim, Kim Jaechul Graduate School of AI, KAIST, Seoul, South Korea
Song Chong, Kim Jaechul Graduate School of AI, KAIST, Seoul, South Korea
Seyoung Yun (ORCID: 0000-0001-6675-5113), Kim Jaechul Graduate School of AI, KAIST, Seoul, South Korea