Multi-Robot Flocking Control Based on Deep Reinforcement Learning

In this paper, we apply deep reinforcement learning (DRL) to solve the flocking control problem of multi-robot systems in complex environments with dynamic obstacles. Starting from the traditional flocking model, we propose a DRL framework for implementing multi-robot flocking control, eliminating t...

Bibliographic Details
Main Authors: Pengming Zhu, Wei Dai, Weijia Yao, Junchong Ma, Zhiwen Zeng, Huimin Lu
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Multi-robot, deep reinforcement learning, flocking control, PER-MADDPG
Online Access: https://ieeexplore.ieee.org/document/9169650/
author Pengming Zhu
Wei Dai
Weijia Yao
Junchong Ma
Zhiwen Zeng
Huimin Lu
collection DOAJ
description In this paper, we apply deep reinforcement learning (DRL) to the flocking control problem of multi-robot systems in complex environments with dynamic obstacles. Starting from the traditional flocking model, we propose a DRL framework for multi-robot flocking control that eliminates the tedious work of modeling and control design. We adopt the multi-agent deep deterministic policy gradient (MADDPG) algorithm, which additionally uses the information of multiple robots during learning to better predict the actions that the robots will take. To address problems such as the low learning efficiency and slow convergence of MADDPG, this paper studies a prioritized experience replay (PER) mechanism and proposes the Prioritized Experience Replay-MADDPG (PER-MADDPG) algorithm. Based on the temporal difference (TD) error, a priority evaluation function is designed to determine which experiences are sampled preferentially from the replay buffer. Simulation results verify the effectiveness of the proposed algorithm: it converges faster and enables the robot group to complete the flocking task in an environment with obstacles.
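The description mentions sampling experiences preferentially according to TD error. The record does not give the paper's priority evaluation function, so the following is only a minimal sketch of the generic proportional prioritized replay scheme, not the authors' PER-MADDPG. The class name, the method names, and the parameters `alpha`, `beta`, and `eps` are assumptions introduced for illustration.

```python
import random


class PrioritizedReplayBuffer:
    """Generic proportional prioritization: P(i) = p_i**alpha / sum_k p_k**alpha.

    Illustrative only; the priority function in the paper may differ.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha      # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions start at the current maximum priority,
        # so they are likely to be sampled soon after insertion.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, delta in zip(idxs, td_errors):
            self.priorities[i] = abs(delta) + self.eps
```

In a training loop of this kind, a critic update would typically compute the TD error of each sampled transition, scale its loss by the returned weight, and then call `update_priorities` with the new errors.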
first_indexed 2024-12-14T15:47:18Z
format Article
id doaj.art-17bec4a3249d473284b09a407ffc4828
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-14T15:47:18Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-17bec4a3249d473284b09a407ffc4828
2022-12-21T22:55:28Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
volume 8, pages 150397-150406
10.1109/ACCESS.2020.3016951
9169650
Multi-Robot Flocking Control Based on Deep Reinforcement Learning
Pengming Zhu (https://orcid.org/0000-0002-2440-5331)
Wei Dai
Weijia Yao (https://orcid.org/0000-0003-0361-6620)
Junchong Ma
Zhiwen Zeng
Huimin Lu (https://orcid.org/0000-0002-6375-581X)
Robotics Research Center, College of Intelligence Science and Technology, National University of Defense Technology, Changsha, China (all authors)
https://ieeexplore.ieee.org/document/9169650/
Multi-robot; deep reinforcement learning; flocking control; PER-MADDPG
title Multi-Robot Flocking Control Based on Deep Reinforcement Learning
topic Multi-robot
deep reinforcement learning
flocking control
PER-MADDPG
url https://ieeexplore.ieee.org/document/9169650/