Deep Reinforcement Learning Task Assignment Based on Domain Knowledge
Deep Reinforcement Learning (DRL) methods explore initial strategies inefficiently in large-scale complex scenarios because of their huge state and action spaces, which has become a bottleneck for applying DRL to large-scale game adversarial scenarios. This paper proposes SITA, a Safe reinforcement learning method combined with Imitation learning for Task Assignment, for a representative red-blue game confrontation scenario.
Main Authors: | Jiayi Liu, Gang Wang, Xiangke Guo, Siyuan Wang, Qiang Fu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2022-01-01 |
Series: | IEEE Access |
Subjects: | Deep reinforcement learning, imitation learning, knowledge rule base, safe reinforcement learning, task assignment |
Online Access: | https://ieeexplore.ieee.org/document/9931113/ |
_version_ | 1828135514088144896 |
---|---|
author | Jiayi Liu; Gang Wang; Xiangke Guo; Siyuan Wang; Qiang Fu |
author_facet | Jiayi Liu; Gang Wang; Xiangke Guo; Siyuan Wang; Qiang Fu |
author_sort | Jiayi Liu |
collection | DOAJ |
description | Deep Reinforcement Learning (DRL) methods explore initial strategies inefficiently in large-scale complex scenarios because of their huge state and action spaces, which has become a bottleneck for applying DRL to large-scale game adversarial scenarios. This paper proposes SITA, a Safe reinforcement learning method combined with Imitation learning for Task Assignment, for a representative red-blue game confrontation scenario. To address the difficulty of collecting samples for Imitation Learning (IL), the paper combines human knowledge with adversarial rules to build a knowledge rule base. It then proposes the Imitation Learning with Decoupled Network (ILDN) pre-training method to reduce excessive invalid exploration at the start of training. To further reduce invalid exploration and improve stability in the later stages of training, a Safe Reinforcement Learning (Safe RL) method is incorporated after pre-training. Finally, experiments on a digital battlefield verify that the SITA method achieves higher training efficiency and strong generalization ability in large-scale complex scenarios. |
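The description above outlines a two-stage pipeline: imitation-learning pre-training from a knowledge rule base with a decoupled network, followed by safe-RL fine-tuning. The sketch below is not the authors' code; it only illustrates that structure in PyTorch under stated assumptions. The network sizes, the target/asset action split, the REINFORCE-style update, and the Lagrangian-style safety penalty weight `lam` are all illustrative choices, not details from the paper.

```python
# Minimal sketch (not the authors' implementation) of a SITA-style pipeline:
# (1) imitation-learning pre-training on (state, action) pairs generated from a
#     knowledge rule base, using a policy with decoupled action heads;
# (2) policy-gradient fine-tuning with a simple per-step safety-cost penalty.
# All dimensions, head names, and the cost signal are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledPolicy(nn.Module):
    """Shared encoder with two decoupled heads, e.g. target choice and asset choice."""
    def __init__(self, obs_dim=64, n_targets=10, n_assets=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.target_head = nn.Linear(128, n_targets)
        self.asset_head = nn.Linear(128, n_assets)

    def forward(self, obs):
        h = self.encoder(obs)
        return self.target_head(h), self.asset_head(h)

def il_pretrain_step(policy, optim, obs, expert_target, expert_asset):
    """Supervised step on actions sampled from the knowledge rule base."""
    t_logits, a_logits = policy(obs)
    loss = F.cross_entropy(t_logits, expert_target) + F.cross_entropy(a_logits, expert_asset)
    optim.zero_grad(); loss.backward(); optim.step()
    return loss.item()

def safe_rl_step(policy, optim, obs, target, asset, advantage, cost, lam=1.0):
    """REINFORCE-style update penalized by a per-step safety cost (weight lam)."""
    t_logits, a_logits = policy(obs)
    logp = (F.log_softmax(t_logits, -1).gather(1, target[:, None]).squeeze(1)
            + F.log_softmax(a_logits, -1).gather(1, asset[:, None]).squeeze(1))
    loss = -(logp * (advantage - lam * cost)).mean()
    optim.zero_grad(); loss.backward(); optim.step()
    return loss.item()

if __name__ == "__main__":
    policy = DecoupledPolicy()
    optim = torch.optim.Adam(policy.parameters(), lr=1e-3)
    obs = torch.randn(32, 64)                    # placeholder battlefield observations
    expert_t = torch.randint(0, 10, (32,))       # synthetic rule-base target labels
    expert_a = torch.randint(0, 8, (32,))        # synthetic rule-base asset labels
    print("IL loss:", il_pretrain_step(policy, optim, obs, expert_t, expert_a))
    adv, cost = torch.randn(32), torch.rand(32)  # synthetic advantage / safety cost
    print("Safe RL loss:", safe_rl_step(policy, optim, obs, expert_t, expert_a, adv, cost))
```

One reason decoupled heads help in large task-assignment problems is size: predicting target and asset separately needs n_targets + n_assets logits instead of n_targets × n_assets for the joint action, which shrinks the output space the pre-training stage must cover.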
first_indexed | 2024-04-11T17:47:43Z |
format | Article |
id | doaj.art-7845d2c2e3884c9c88d1f3114cd57c39 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T17:47:43Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-7845d2c2e3884c9c88d1f3114cd57c39; updated 2022-12-22T04:11:13Z; English; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; vol. 10, pp. 114402–114413; DOI 10.1109/ACCESS.2022.3217654; article 9931113; Deep Reinforcement Learning Task Assignment Based on Domain Knowledge; Jiayi Liu (https://orcid.org/0000-0003-2432-4627), Gang Wang (https://orcid.org/0000-0002-8195-365X), Xiangke Guo (https://orcid.org/0000-0002-0235-8992), Siyuan Wang (https://orcid.org/0000-0001-7003-4722), Qiang Fu (https://orcid.org/0000-0002-1456-4216), all affiliated with the Air Defense and Antimissile School, Air Force Engineering University, Xi’an, China; abstract as given in the description field above; https://ieeexplore.ieee.org/document/9931113/; Deep reinforcement learning; imitation learning; knowledge rule base; safe reinforcement learning; task assignment |
spellingShingle | Jiayi Liu; Gang Wang; Xiangke Guo; Siyuan Wang; Qiang Fu; Deep Reinforcement Learning Task Assignment Based on Domain Knowledge; IEEE Access; Deep reinforcement learning; imitation learning; knowledge rule base; safe reinforcement learning; task assignment |
title | Deep Reinforcement Learning Task Assignment Based on Domain Knowledge |
title_full | Deep Reinforcement Learning Task Assignment Based on Domain Knowledge |
title_fullStr | Deep Reinforcement Learning Task Assignment Based on Domain Knowledge |
title_full_unstemmed | Deep Reinforcement Learning Task Assignment Based on Domain Knowledge |
title_short | Deep Reinforcement Learning Task Assignment Based on Domain Knowledge |
title_sort | deep reinforcement learning task assignment based on domain knowledge |
topic | Deep reinforcement learning; imitation learning; knowledge rule base; safe reinforcement learning; task assignment |
url | https://ieeexplore.ieee.org/document/9931113/ |
work_keys_str_mv | AT jiayiliu deepreinforcementlearningtaskassignmentbasedondomainknowledge AT gangwang deepreinforcementlearningtaskassignmentbasedondomainknowledge AT xiangkeguo deepreinforcementlearningtaskassignmentbasedondomainknowledge AT siyuanwang deepreinforcementlearningtaskassignmentbasedondomainknowledge AT qiangfu deepreinforcementlearningtaskassignmentbasedondomainknowledge |