Deep Reinforcement Learning Task Assignment Based on Domain Knowledge

Deep Reinforcement Learning (DRL) methods are inefficient in the initial strategy exploration process due to the huge state space and action space in large-scale complex scenarios. This is becoming one of the bottlenecks in their application to large-scale game adversarial scenarios. This paper prop...

Bibliographic Details
Main Authors: Jiayi Liu, Gang Wang, Xiangke Guo, Siyuan Wang, Qiang Fu
Format: Article
Language: English
Published: IEEE 2022-01-01
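Series: IEEE Access
Subjects: Deep reinforcement learning; imitation learning; knowledge rule base; safe reinforcement learning; task assignment
Online Access: https://ieeexplore.ieee.org/document/9931113/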
collection DOAJ
description Deep Reinforcement Learning (DRL) methods explore inefficiently in the early stages of training because the state and action spaces of large-scale complex scenarios are huge. This is becoming one of the bottlenecks in applying them to large-scale adversarial game scenarios. This paper proposes a Safe reinforcement learning combined with Imitation learning for Task Assignment (SITA) method for a representative red-blue game confrontation scenario. To address the difficulty of sampling for Imitation Learning (IL), the paper combines human knowledge with adversarial rules to build a knowledge rule base. To curb excessive invalid exploration early in training, it proposes the Imitation Learning with the Decoupled Network (ILDN) pre-training method. To reduce invalid exploration and improve stability in the later stages of training, it incorporates a Safe Reinforcement Learning (Safe RL) method after pre-training. Finally, experiments in a digital battlefield verify that the SITA method has higher training efficiency and strong generalization ability in large-scale complex scenarios.
first_indexed 2024-04-11T17:47:43Z
format Article
id doaj.art-7845d2c2e3884c9c88d1f3114cd57c39
institution Directory of Open Access Journals
issn 2169-3536
language English
last_indexed 2024-04-11T17:47:43Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Record ID: doaj.art-7845d2c2e3884c9c88d1f3114cd57c39
Record timestamp: 2022-12-22T04:11:13Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Publication date: 2022-01-01
Volume: 10
Pages: 114402-114413
DOI: 10.1109/ACCESS.2022.3217654
IEEE document number: 9931113
Title: Deep Reinforcement Learning Task Assignment Based on Domain Knowledge
Authors and ORCID iDs:
Jiayi Liu, https://orcid.org/0000-0003-2432-4627
Gang Wang, https://orcid.org/0000-0002-8195-365X
Xiangke Guo, https://orcid.org/0000-0002-0235-8992
Siyuan Wang, https://orcid.org/0000-0001-7003-4722
Qiang Fu, https://orcid.org/0000-0002-1456-4216
Affiliation (all authors): Air Defense and Antimissile School, Air Force Engineering University, Xi’an, China
topic Deep reinforcement learning
imitation learning
knowledge rule base
safe reinforcement learning
task assignment