CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning

In this paper, we propose a novel curiosity-based learning algorithm for multi-agent reinforcement learning (MARL) to attain efficient and effective decision-making. We employ the centralized training with decentralized execution (CTDE) framework and assume that each agent has knowledge of the prior action distributions of the other agents. To quantify the difference in agents' knowledge, which we term curiosity, we introduce conditional mutual information (CMI) regularization and use the amount of information to update the decision-making policy. Then, to deploy this learning framework in large-scale MARL settings while achieving high sample efficiency, we adopt a Kullback-Leibler (KL) divergence-based prioritization of experiences. We evaluate the effectiveness of the proposed algorithm on three difficulty levels of StarCraft Multi-Agent Challenge (SMAC) scenarios using the PyMARL framework. The simulation-based performance analysis shows that the proposed technique significantly improves the test win rate over various state-of-the-art MARL benchmarks, such as Optimistically Weighted Monotonic Value Function Factorization (OW_QMIX) and Learning Individual Intrinsic Reward (LIIR).
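The abstract names two mechanisms that lend themselves to a short illustration: a CMI-based curiosity signal and KL divergence-based prioritization of experiences. The sketch below shows only the second idea, under an explicit assumption: a proportional prioritized-replay scheme (in the style of standard PER) in which a transition's priority is the KL divergence between the policy snapshot that stored it and the current policy. The function names (`kl_priority`, `sample_indices`) and the hyperparameters `alpha` and `eps` are illustrative assumptions; the paper's exact formulation is not given in this record.

```python
# Hypothetical sketch of KL-divergence-based experience prioritization
# (assumed proportional-PER variant; not the authors' published code).
import torch
from torch.distributions import Categorical, kl_divergence

def kl_priority(stored_logits: torch.Tensor,
                current_logits: torch.Tensor,
                eps: float = 1e-3) -> torch.Tensor:
    """Priority of each stored transition: KL between the policy that
    generated it and the current policy. A large KL means the policy
    has moved since storage, so the transition is treated as more
    informative for replay."""
    p_stored = Categorical(logits=stored_logits)
    p_current = Categorical(logits=current_logits)
    return kl_divergence(p_stored, p_current) + eps  # eps keeps priorities > 0

def sample_indices(priorities: torch.Tensor,
                   batch_size: int,
                   alpha: float = 0.6) -> torch.Tensor:
    """Proportional sampling, as in standard prioritized replay:
    P(i) = priority_i^alpha / sum_j priority_j^alpha."""
    probs = priorities.pow(alpha)
    probs = probs / probs.sum()
    return torch.multinomial(probs, batch_size, replacement=True)

# Usage sketch: a buffer of 1000 transitions over 5 discrete actions.
stored = torch.randn(1000, 5)                  # policy logits recorded at storage time
current = stored + 0.1 * torch.randn(1000, 5)  # stand-in for the updated policy
batch_idx = sample_indices(kl_priority(stored, current), batch_size=32)
```

Compared with the TD-error priorities of standard PER, a KL-based priority of this kind replays the transitions on which the behaviour policy has drifted most, which is consistent with the abstract's motivation of keeping sample efficiency high in large-scale MARL settings.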


Bibliographic Details
Main Authors: Devarani Devi Ningombam, Byunghyun Yoo, Hyun Woo Kim, Hwa Jeon Song, Sungwon Yi
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access
Subjects: Multi-agent reinforcement learning; curiosity; conditional mutual information; prioritized experience replay
Online Access: https://ieeexplore.ieee.org/document/9857920/
author Devarani Devi Ningombam
Byunghyun Yoo
Hyun Woo Kim
Hwa Jeon Song
Sungwon Yi
author_facet Devarani Devi Ningombam
Byunghyun Yoo
Hyun Woo Kim
Hwa Jeon Song
Sungwon Yi
author_sort Devarani Devi Ningombam
collection DOAJ
description In this paper, we propose a novel curiosity-based learning algorithm for multi-agent reinforcement learning (MARL) to attain efficient and effective decision-making. We employ the centralized training with decentralized execution (CTDE) framework and assume that each agent has knowledge of the prior action distributions of the other agents. To quantify the difference in agents' knowledge, which we term curiosity, we introduce conditional mutual information (CMI) regularization and use the amount of information to update the decision-making policy. Then, to deploy this learning framework in large-scale MARL settings while achieving high sample efficiency, we adopt a Kullback-Leibler (KL) divergence-based prioritization of experiences. We evaluate the effectiveness of the proposed algorithm on three difficulty levels of StarCraft Multi-Agent Challenge (SMAC) scenarios using the PyMARL framework. The simulation-based performance analysis shows that the proposed technique significantly improves the test win rate over various state-of-the-art MARL benchmarks, such as Optimistically Weighted Monotonic Value Function Factorization (OW_QMIX) and Learning Individual Intrinsic Reward (LIIR).
first_indexed 2024-04-11T21:14:20Z
format Article
id doaj.art-ea341c98ba52493b8fe43c46f77bd8a6
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-11T21:14:20Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Record doaj.art-ea341c98ba52493b8fe43c46f77bd8a6, indexed 2022-12-22T04:02:53Z, English, IEEE, IEEE Access (ISSN 2169-3536), published 2022-01-01, vol. 10, pp. 87254-87265, DOI 10.1109/ACCESS.2022.3198981, IEEE article 9857920.
Title: CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
Authors and affiliations:
Devarani Devi Ningombam (https://orcid.org/0000-0002-6946-6584), Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies (UPES), Dehradun, Uttarakhand, India
Byunghyun Yoo (https://orcid.org/0000-0003-0857-5565), Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Hyun Woo Kim, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Hwa Jeon Song (https://orcid.org/0000-0002-8216-4812), Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Sungwon Yi (https://orcid.org/0000-0002-4986-9546), Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Abstract: as in the description field above.
Online access: https://ieeexplore.ieee.org/document/9857920/
Keywords: Multi-agent reinforcement learning; curiosity; conditional mutual information; prioritized experience replay
spellingShingle Devarani Devi Ningombam
Byunghyun Yoo
Hyun Woo Kim
Hwa Jeon Song
Sungwon Yi
CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
IEEE Access
Multi-agent reinforcement learning
curiosity
conditional mutual information
prioritized experience replay
title CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_full CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_fullStr CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_full_unstemmed CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_short CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_sort cumarl curiosity based learning in multiagent reinforcement learning
topic Multi-agent reinforcement learning
curiosity
conditional mutual information
prioritized experience replay
url https://ieeexplore.ieee.org/document/9857920/
work_keys_str_mv AT devaranideviningombam cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT byunghyunyoo cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT hyunwookim cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT hwajeonsong cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT sungwonyi cumarlcuriositybasedlearninginmultiagentreinforcementlearning