Stepwise Soft Actor–Critic for UAV Autonomous Flight Control
Despite the growing demand for unmanned aerial vehicles (UAVs), the use of conventional UAVs is limited, as most of them require being remotely operated by a person who is not within the vehicle’s field of view. Recently, many studies have introduced reinforcement learning (RL) to address hurdles for the autonomous flight of UAVs…
Main Authors: | Ha Jun Hwang, Jaeyeon Jang, Jongkwan Choi, Jung Ho Bae, Sung Ho Kim, Chang Ouk Kim |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-08-01 |
Series: | Drones |
Subjects: | reinforcement learning; autonomous flight control; unmanned aerial vehicle; soft actor–critic; JSBSim |
Online Access: | https://www.mdpi.com/2504-446X/7/9/549 |
_version_ | 1827726487671799808 |
author | Ha Jun Hwang; Jaeyeon Jang; Jongkwan Choi; Jung Ho Bae; Sung Ho Kim; Chang Ouk Kim |
author_facet | Ha Jun Hwang; Jaeyeon Jang; Jongkwan Choi; Jung Ho Bae; Sung Ho Kim; Chang Ouk Kim |
author_sort | Ha Jun Hwang |
collection | DOAJ |
description | Despite the growing demand for unmanned aerial vehicles (UAVs), the use of conventional UAVs is limited, as most of them require being remotely operated by a person who is not within the vehicle’s field of view. Recently, many studies have introduced reinforcement learning (RL) to address hurdles for the autonomous flight of UAVs. However, most previous studies have assumed overly simplified environments, and thus, they cannot be applied to real-world UAV operation scenarios. To address the limitations of previous studies, we propose a stepwise soft actor–critic (SeSAC) algorithm for efficient learning in a continuous state and action space environment. SeSAC aims to overcome the inefficiency of learning caused by attempting challenging tasks from the beginning. Instead, it starts with easier missions and gradually increases the difficulty level during training, ultimately achieving the final goal. We also control a learning hyperparameter of the soft actor–critic algorithm and implement a positive buffer mechanism during training to enhance learning effectiveness. Our proposed algorithm was verified in a six-degree-of-freedom (DOF) flight environment with high-dimensional state and action spaces. The experimental results demonstrate that the proposed algorithm successfully completed missions in two challenging scenarios, one for disaster management and another for counter-terrorism missions, while surpassing the performance of other baseline approaches. |
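The abstract describes three ingredients of SeSAC: a stepwise curriculum that only advances to a harder mission once the current one is mastered, control of a SAC learning hyperparameter, and a positive buffer that preserves successful experience. The following is a minimal sketch of the curriculum loop and the positive-buffer idea only; all names, thresholds, and the success-only storage policy are illustrative assumptions, not the authors' implementation, and the SAC temperature-control part is omitted.

```python
import random
from collections import deque

class PositiveBuffer:
    """Replay buffer that retains only transitions from successful episodes.
    (The paper mentions a 'positive buffer'; this storage policy is an assumption.)"""
    def __init__(self, capacity=10000):
        self.data = deque(maxlen=capacity)

    def add_episode(self, transitions, succeeded):
        if succeeded:  # keep only trajectories that completed the mission
            self.data.extend(transitions)

    def sample(self, k):
        return random.sample(list(self.data), min(k, len(self.data)))

def stepwise_training(train_stage, stages, success_threshold=0.9, episodes_per_eval=20):
    """Curriculum loop: repeat training at the current difficulty stage and
    advance to the next (harder) stage once the success rate over a batch of
    episodes clears a threshold. `train_stage(stage)` is a user-supplied
    callable that runs one episode and returns True on mission success."""
    history = []
    for stage in stages:
        while True:
            flags = [train_stage(stage) for _ in range(episodes_per_eval)]
            rate = sum(flags) / len(flags)
            history.append((stage, rate))
            if rate >= success_threshold:
                break  # current difficulty mastered; move to the next stage
    return history
```

For example, with two stages ("easy", "hard") and an agent stub that starts succeeding after a few episodes, the loop records the per-batch success rate at each stage and only advances once the threshold is met.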
first_indexed | 2024-03-10T22:52:34Z |
format | Article |
id | doaj.art-d43e22f296094e51afcede28c003f988 |
institution | Directory Open Access Journal |
issn | 2504-446X |
language | English |
last_indexed | 2024-03-10T22:52:34Z |
publishDate | 2023-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Drones |
spelling | doaj.art-d43e22f296094e51afcede28c003f988 2023-11-19T10:16:52Z eng. MDPI AG. Drones, ISSN 2504-446X, 2023-08-01, Vol. 7, Iss. 9, Art. 549. DOI: 10.3390/drones7090549. Stepwise Soft Actor–Critic for UAV Autonomous Flight Control. Ha Jun Hwang: Department of Industrial Engineering, Yonsei University, Seoul 03722, Republic of Korea. Jaeyeon Jang: Department of Data Science, The Catholic University of Korea, Bucheon 14662, Republic of Korea. Jongkwan Choi: Department of Industrial Engineering, Yonsei University, Seoul 03722, Republic of Korea. Jung Ho Bae: Defense Artificial Intelligence Technology Center, Agency for Defense Development, Daejeon 34186, Republic of Korea. Sung Ho Kim: Defense Artificial Intelligence Technology Center, Agency for Defense Development, Daejeon 34186, Republic of Korea. Chang Ouk Kim: Department of Industrial Engineering, Yonsei University, Seoul 03722, Republic of Korea. https://www.mdpi.com/2504-446X/7/9/549. Keywords: reinforcement learning; autonomous flight control; unmanned aerial vehicle; soft actor–critic; JSBSim |
spellingShingle | Ha Jun Hwang; Jaeyeon Jang; Jongkwan Choi; Jung Ho Bae; Sung Ho Kim; Chang Ouk Kim. Stepwise Soft Actor–Critic for UAV Autonomous Flight Control. Drones. Keywords: reinforcement learning; autonomous flight control; unmanned aerial vehicle; soft actor–critic; JSBSim |
title | Stepwise Soft Actor–Critic for UAV Autonomous Flight Control |
title_full | Stepwise Soft Actor–Critic for UAV Autonomous Flight Control |
title_fullStr | Stepwise Soft Actor–Critic for UAV Autonomous Flight Control |
title_full_unstemmed | Stepwise Soft Actor–Critic for UAV Autonomous Flight Control |
title_short | Stepwise Soft Actor–Critic for UAV Autonomous Flight Control |
title_sort | stepwise soft actor critic for uav autonomous flight control |
topic | reinforcement learning autonomous flight control unmanned aerial vehicle soft actor–critic JSBSim |
url | https://www.mdpi.com/2504-446X/7/9/549 |
work_keys_str_mv | AT hajunhwang stepwisesoftactorcriticforuavautonomousflightcontrol AT jaeyeonjang stepwisesoftactorcriticforuavautonomousflightcontrol AT jongkwanchoi stepwisesoftactorcriticforuavautonomousflightcontrol AT junghobae stepwisesoftactorcriticforuavautonomousflightcontrol AT sunghokim stepwisesoftactorcriticforuavautonomousflightcontrol AT changoukkim stepwisesoftactorcriticforuavautonomousflightcontrol |