CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning

In this paper, we propose a novel curiosity-based learning algorithm for multi-agent reinforcement learning (MARL) to attain efficient and effective decision-making. We employ the centralized training with decentralized execution (CTDE) framework and assume that each agent has knowledge of the prior action distributions of the other agents. To quantify the difference in agents' knowledge, which we term curiosity, we introduce conditional mutual information (CMI) regularization and use the amount of information to update the decision-making policy. Then, to deploy this learning framework in large-scale MARL settings while achieving high sample efficiency, we adopt a Kullback-Leibler (KL) divergence-based prioritization of experiences. We evaluate the effectiveness of the proposed algorithm on three difficulty levels of StarCraft Multi-Agent Challenge (SMAC) scenarios using the PyMARL framework. The simulation-based performance analysis shows that the proposed technique significantly improves the test win rate over various state-of-the-art MARL benchmarks, such as Optimistically Weighted Monotonic Value Function Factorization (OW_QMIX) and Learning Individual Intrinsic Reward (LIIR).
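The abstract names two mechanisms that lend themselves to a short illustration: a CMI-based curiosity signal and KL divergence-based prioritization of experiences. The sketch below shows only the second idea, under an explicit assumption: a proportional prioritized-replay scheme (in the style of standard PER) in which a transition's priority is the KL divergence between the policy snapshot that stored it and the current policy. The function names (`kl_priority`, `sample_indices`) and the hyperparameters `alpha` and `eps` are illustrative assumptions; the paper's exact formulation is not given in this record.

```python
# Hypothetical sketch of KL-divergence-based experience prioritization
# (assumed proportional-PER variant; not the authors' published code).
import torch
from torch.distributions import Categorical, kl_divergence

def kl_priority(stored_logits: torch.Tensor,
                current_logits: torch.Tensor,
                eps: float = 1e-3) -> torch.Tensor:
    """Priority of each stored transition: KL between the policy that
    generated it and the current policy. A large KL means the policy
    has moved since storage, so the transition is treated as more
    informative for replay."""
    p_stored = Categorical(logits=stored_logits)
    p_current = Categorical(logits=current_logits)
    return kl_divergence(p_stored, p_current) + eps  # eps keeps priorities > 0

def sample_indices(priorities: torch.Tensor,
                   batch_size: int,
                   alpha: float = 0.6) -> torch.Tensor:
    """Proportional sampling, as in standard prioritized replay:
    P(i) = priority_i^alpha / sum_j priority_j^alpha."""
    probs = priorities.pow(alpha)
    probs = probs / probs.sum()
    return torch.multinomial(probs, batch_size, replacement=True)

# Usage sketch: a buffer of 1000 transitions over 5 discrete actions.
stored = torch.randn(1000, 5)                  # policy logits recorded at storage time
current = stored + 0.1 * torch.randn(1000, 5)  # stand-in for the updated policy
batch_idx = sample_indices(kl_priority(stored, current), batch_size=32)
```

Compared with the TD-error priorities of standard PER, a KL-based priority of this kind replays the transitions on which the behaviour policy has drifted most, which is consistent with the abstract's motivation of keeping sample efficiency high in large-scale MARL settings.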


Bibliographic Details
Main Authors: Devarani Devi Ningombam, Byunghyun Yoo, Hyun Woo Kim, Hwa Jeon Song, Sungwon Yi
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access
Subjects: Multi-agent reinforcement learning; curiosity; conditional mutual information; prioritized experience replay
Online Access: https://ieeexplore.ieee.org/document/9857920/
author Devarani Devi Ningombam
Byunghyun Yoo
Hyun Woo Kim
Hwa Jeon Song
Sungwon Yi
author_facet Devarani Devi Ningombam
Byunghyun Yoo
Hyun Woo Kim
Hwa Jeon Song
Sungwon Yi
author_sort Devarani Devi Ningombam
collection DOAJ
description In this paper, we propose a novel curiosity-based learning algorithm for multi-agent reinforcement learning (MARL) to attain efficient and effective decision-making. We employ the centralized training with decentralized execution (CTDE) framework and assume that each agent has knowledge of the prior action distributions of the other agents. To quantify the difference in agents' knowledge, which we term curiosity, we introduce conditional mutual information (CMI) regularization and use the amount of information to update the decision-making policy. Then, to deploy this learning framework in large-scale MARL settings while achieving high sample efficiency, we adopt a Kullback-Leibler (KL) divergence-based prioritization of experiences. We evaluate the effectiveness of the proposed algorithm on three difficulty levels of StarCraft Multi-Agent Challenge (SMAC) scenarios using the PyMARL framework. The simulation-based performance analysis shows that the proposed technique significantly improves the test win rate over various state-of-the-art MARL benchmarks, such as Optimistically Weighted Monotonic Value Function Factorization (OW_QMIX) and Learning Individual Intrinsic Reward (LIIR).
first_indexed 2024-04-11T21:14:20Z
format Article
id doaj.art-ea341c98ba52493b8fe43c46f77bd8a6
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-11T21:14:20Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Record doaj.art-ea341c98ba52493b8fe43c46f77bd8a6, indexed 2022-12-22T04:02:53Z, English, IEEE, IEEE Access (ISSN 2169-3536), published 2022-01-01, vol. 10, pp. 87254-87265, DOI 10.1109/ACCESS.2022.3198981, IEEE article 9857920.
Title: CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
Authors and affiliations:
Devarani Devi Ningombam (https://orcid.org/0000-0002-6946-6584), Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies (UPES), Dehradun, Uttarakhand, India
Byunghyun Yoo (https://orcid.org/0000-0003-0857-5565), Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Hyun Woo Kim, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Hwa Jeon Song (https://orcid.org/0000-0002-8216-4812), Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Sungwon Yi (https://orcid.org/0000-0002-4986-9546), Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Abstract: as in the description field above.
Online access: https://ieeexplore.ieee.org/document/9857920/
Keywords: Multi-agent reinforcement learning; curiosity; conditional mutual information; prioritized experience replay
spellingShingle Devarani Devi Ningombam
Byunghyun Yoo
Hyun Woo Kim
Hwa Jeon Song
Sungwon Yi
CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
IEEE Access
Multi-agent reinforcement learning
curiosity
conditional mutual information
prioritized experience replay
title CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_full CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_fullStr CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_full_unstemmed CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_short CuMARL: Curiosity-Based Learning in Multiagent Reinforcement Learning
title_sort cumarl curiosity based learning in multiagent reinforcement learning
topic Multi-agent reinforcement learning
curiosity
conditional mutual information
prioritized experience replay
url https://ieeexplore.ieee.org/document/9857920/
work_keys_str_mv AT devaranideviningombam cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT byunghyunyoo cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT hyunwookim cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT hwajeonsong cumarlcuriositybasedlearninginmultiagentreinforcementlearning
AT sungwonyi cumarlcuriositybasedlearninginmultiagentreinforcementlearning