Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics
We propose a deep reinforcement learning approach for solving a mapless navigation problem in warehouse scenarios. In our approach, an automatic guided vehicle is equipped with two LiDAR sensors and one frontal RGB camera and learns to perform a targeted navigation task. The challenges reside in the...
| Main Authors: | Honghu Xue, Benedikt Hein, Mohamed Bakr, Georg Schildbach, Bengt Abel, Elmar Rueckert |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2022-03-01 |
| Series: | Applied Sciences |
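| Subjects: | deep reinforcement learning; automatic curriculum learning; autonomous navigation; multi-modal sensor perception |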
| Online Access: | https://www.mdpi.com/2076-3417/12/6/3153 |
| _version_ | 1827649930760552448 |
|---|---|
| author | Honghu Xue, Benedikt Hein, Mohamed Bakr, Georg Schildbach, Bengt Abel, Elmar Rueckert |
| author_sort | Honghu Xue |
| collection | DOAJ |
| description | We propose a deep reinforcement learning approach for solving a mapless navigation problem in warehouse scenarios. In our approach, an automatic guided vehicle is equipped with two LiDAR sensors and one frontal RGB camera and learns to perform a targeted navigation task. The challenges reside in the sparseness of positive samples for learning, multi-modal sensor perception with partial observability, and the demand for accurate steering maneuvers together with long training cycles. To address these points, we propose <i>NavACL-Q</i> as an automatic curriculum learning method in combination with a distributed version of the soft actor-critic algorithm. The performance of the learning algorithm is evaluated exhaustively in a different warehouse environment to validate both robustness and generalizability of the learned policy. Results in NVIDIA Isaac Sim demonstrate that our trained agent significantly outperforms the map-based navigation pipeline provided by NVIDIA Isaac Sim with an increased agent-goal distance of 3 m and a wider initial relative agent-goal rotation of approximately 45°. The ablation studies also suggest that <i>NavACL-Q</i> greatly facilitates the whole learning process with a performance gain of roughly 40% compared to training with random starts, and that a pre-trained feature extractor boosts the performance by approximately 60%. |
| first_indexed | 2024-03-09T20:08:24Z |
| format | Article |
| id | doaj.art-16d76c5fd3064df4be12085493af5fed |
| institution | Directory Open Access Journal |
| issn | 2076-3417 |
| language | English |
| last_indexed | 2024-03-09T20:08:24Z |
| publishDate | 2022-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj.art-16d76c5fd3064df4be12085493af5fed; 2023-11-24T00:24:36Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-03-01; 12(6), 3153; 10.3390/app12063153; Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics; Honghu Xue (0), Benedikt Hein (1), Mohamed Bakr (2), Georg Schildbach (3), Bengt Abel (4), Elmar Rueckert (5); (0) Institute for Robotics and Cognitive Systems, University of Luebeck, 23562 Luebeck, Germany; (1) Institute for Robotics and Cognitive Systems, University of Luebeck, 23562 Luebeck, Germany; (2) KION Group AG, Technology and Innovation, 22113 Hamburg, Germany; (3) Institute for Electrical Engineering in Medicine, University of Luebeck, 23562 Luebeck, Germany; (4) KION Group AG, Technology and Innovation, 22113 Hamburg, Germany; (5) Institute for Cyber Physical Systems, Montanuniversität Leoben, 8700 Leoben, Austria; https://www.mdpi.com/2076-3417/12/6/3153; deep reinforcement learning; automatic curriculum learning; autonomous navigation; multi-modal sensor perception |
| title | Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics |
| topic | deep reinforcement learning; automatic curriculum learning; autonomous navigation; multi-modal sensor perception |
| url | https://www.mdpi.com/2076-3417/12/6/3153 |