Autonomous Maneuver Decision-Making Through Curriculum Learning and Reinforcement Learning With Sparse Rewards

Reinforcement learning is an effective approach for solving decision-making problems. However, when using reinforcement learning to solve maneuver decision-making with sparse rewards, it costs too much time for training, and the final performance may not be satisfactory. In order to overcome the shortcomings, the method for maneuver decision-making based on curriculum learning and reinforcement learning is proposed.


Bibliographic Details
Main Authors:	Yujie Wei, Hongpeng Zhang, Yuan Wang, Changqiang Huang
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Maneuver decision-making curriculum learning reinforcement learning sparse rewards
Online Access:	https://ieeexplore.ieee.org/document/10188394/

_version_	1827893496980176896
author	Yujie Wei Hongpeng Zhang Yuan Wang Changqiang Huang
author_sort	Yujie Wei
collection	DOAJ
description	Reinforcement learning is an effective approach for solving decision-making problems. However, when using reinforcement learning to solve maneuver decision-making with sparse rewards, it costs too much time for training, and the final performance may not be satisfactory. In order to overcome the shortcomings, the method for maneuver decision-making based on curriculum learning and reinforcement learning is proposed. First, three curricula are designed to address the maneuver decision-making problem: angle curriculum, distance curriculum and hybrid curriculum. They are proposed according to the intuition that closer destinations are easier to arrive at. Then, they are used to train agents and compared with the original method without any curriculum. The training results show that angle curriculum can increase the speed and stability of training, and improve the performance of maneuver decision-making; distance curriculum can increase the speed and stability of agent training; hybrid curriculum is not better than the other curricula, because it makes the agent get stuck at the local optimum. The simulation results show that after training, the agent can handle the situations where targets come from different directions, and the maneuver decision-makings are rational, effective, and interpretable, whereas the method without curriculum is invalid.
first_indexed	2024-03-12T21:54:36Z
format	Article
id	doaj.art-e2d904be230b4be3bcbc7a20c6a1aaf7
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-12T21:54:36Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-e2d904be230b4be3bcbc7a20c6a1aaf7; 2023-07-25T23:00:23Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 73543-73555; DOI 10.1109/ACCESS.2023.3297095; article no. 10188394; Yujie Wei (0); Hongpeng Zhang (1, https://orcid.org/0000-0001-5644-0089); Yuan Wang (2, https://orcid.org/0000-0002-1563-7405); Changqiang Huang (3, https://orcid.org/0000-0002-3746-2343); all authors: Institute of Aeronautics Engineering, Air Force Engineering University, Xi’an, China
title	Autonomous Maneuver Decision-Making Through Curriculum Learning and Reinforcement Learning With Sparse Rewards
topic	Maneuver decision-making curriculum learning reinforcement learning sparse rewards
url	https://ieeexplore.ieee.org/document/10188394/

