Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics

We propose a deep reinforcement learning approach for solving a mapless navigation problem in warehouse scenarios. In our approach, an automatic guided vehicle is equipped with two LiDAR sensors and one frontal RGB camera and learns to perform a targeted navigation task. The challenges reside in the...

Bibliographic Details
Main Authors: Honghu Xue, Benedikt Hein, Mohamed Bakr, Georg Schildbach, Bengt Abel, Elmar Rueckert
Format: Article
Language: English
Published: MDPI AG 2022-03-01
Series: Applied Sciences
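Subjects: deep reinforcement learning, automatic curriculum learning, autonomous navigation, multi-modal sensor perception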
Online Access: https://www.mdpi.com/2076-3417/12/6/3153
author Honghu Xue
Benedikt Hein
Mohamed Bakr
Georg Schildbach
Bengt Abel
Elmar Rueckert
collection DOAJ
description We propose a deep reinforcement learning approach for solving a mapless navigation problem in warehouse scenarios. In our approach, an automatic guided vehicle is equipped with two LiDAR sensors and one frontal RGB camera and learns to perform a targeted navigation task. The challenges reside in the sparseness of positive samples for learning, multi-modal sensor perception with partial observability, and the demand for accurate steering maneuvers together with long training cycles. To address these points, we propose NavACL-Q as an automatic curriculum learning method in combination with a distributed version of the soft actor-critic algorithm. The performance of the learning algorithm is evaluated exhaustively in a different warehouse environment to validate both robustness and generalizability of the learned policy. Results in NVIDIA Isaac Sim demonstrate that our trained agent significantly outperforms the map-based navigation pipeline provided by NVIDIA Isaac Sim with an increased agent-goal distance of 3 m and a wider initial relative agent-goal rotation of approximately 45°. The ablation studies also suggest that NavACL-Q greatly facilitates the whole learning process with a performance gain of roughly 40% compared to training with random starts, and that a pre-trained feature extractor manifestly boosts the performance by approximately 60%.
first_indexed 2024-03-09T20:08:24Z
format Article
id doaj.art-16d76c5fd3064df4be12085493af5fed
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T20:08:24Z
publishDate 2022-03-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-16d76c5fd3064df4be12085493af5fed
2023-11-24T00:24:36Z
eng
MDPI AG
Applied Sciences
2076-3417
2022-03-01
12
6
3153
10.3390/app12063153
Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics
Honghu Xue (Institute for Robotics and Cognitive Systems, University of Luebeck, 23562 Luebeck, Germany)
Benedikt Hein (Institute for Robotics and Cognitive Systems, University of Luebeck, 23562 Luebeck, Germany)
Mohamed Bakr (KION Group AG, Technology and Innovation, 22113 Hamburg, Germany)
Georg Schildbach (Institute for Electrical Engineering in Medicine, University of Luebeck, 23562 Luebeck, Germany)
Bengt Abel (KION Group AG, Technology and Innovation, 22113 Hamburg, Germany)
Elmar Rueckert (Institute for Cyber Physical Systems, Montanuniversität Leoben, 8700 Leoben, Austria)
https://www.mdpi.com/2076-3417/12/6/3153
deep reinforcement learning
automatic curriculum learning
autonomous navigation
multi-modal sensor perception
title Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics
topic deep reinforcement learning
automatic curriculum learning
autonomous navigation
multi-modal sensor perception
url https://www.mdpi.com/2076-3417/12/6/3153