Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings
Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, the use of DPL in real-world applications has not been sufficiently explored, owing to the inherent challenges of mobilizing direct human expertise and the difficulty of measuring comparative performance. Furthermore, autonomous systems are often resource-constrained, which limits the potential application and implementation of highly effective deep learning models. In this work, we present a lightweight DPL-based approach for training mobile robots in navigational tasks. We integrated a safety policy alongside the navigational policy to safeguard the robot and the environment. The approach was evaluated in simulations and real-world settings and compared with recent work in this space. The results of these experiments, together with the efficient transfer from simulation to real-world settings, demonstrate that our approach improves performance compared to its hardware-intensive counterparts. We show that, using the proposed methodology, the training agent achieves performance close to that of the expert within the first 15 training iterations in both simulation and real-world settings.
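The record carries no implementation details beyond this abstract, but the training scheme it summarizes (a lightweight direct policy learning loop in which a safety policy gates the navigational policy while expert labels are collected over roughly 15 iterations) follows the general pattern of DAgger-style imitation learning. The sketch below is only an illustration of that pattern under assumed names and a toy corridor environment: `expert_action`, `safety_override`, the two-beam range observation, and the 2-parameter linear learner standing in for the paper's lightweight model are all hypothetical, not the authors' code.

```python
# Minimal DAgger-style direct policy learning sketch with a safety override.
# All environment details and names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def expert_action(obs):
    """Illustrative expert: steer toward the side with more free space."""
    left, right = obs
    return float(np.clip(right - left, -1.0, 1.0))

def safety_override(obs, proposed):
    """Safety policy: override the learner when an obstacle is too close."""
    left, right = obs
    if min(left, right) < 0.2:                   # wall within the safety margin
        return 1.0 if left < right else -1.0     # hard turn toward free space
    return proposed

def rollout(weights, steps=50):
    """Roll out the safety-gated learner in a toy corridor; log visited states."""
    pos = rng.uniform(-0.4, 0.4)                 # lateral offset from centerline
    observations = []
    for _ in range(steps):
        obs = np.array([0.5 + pos, 0.5 - pos])   # distances to left/right wall
        act = float(np.clip(obs @ weights, -1.0, 1.0))
        act = safety_override(obs, act)          # safety policy gates every action
        pos = float(np.clip(pos + 0.1 * act + rng.normal(0, 0.02), -0.45, 0.45))
        observations.append(obs)
    return np.array(observations)

weights = np.zeros(2)                            # compact 2-parameter linear policy
data_x, data_y = [], []
for iteration in range(15):                      # the abstract reports ~15 iterations
    obs_batch = rollout(weights)
    # DAgger-style step: relabel the states visited by the learner with expert
    # actions, aggregate them, and refit the policy on the growing dataset.
    data_x.append(obs_batch)
    data_y.append(np.array([expert_action(o) for o in obs_batch]))
    X, y = np.concatenate(data_x), np.concatenate(data_y)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    gap = float(np.mean(np.abs(X @ weights - y)))
    print(f"iteration {iteration:2d}  mean action gap to expert: {gap:.3f}")
```

In this toy setting the safety rule only intervenes near the walls, so it safeguards the simulated platform without replacing the learned navigational policy, which is the division of roles the abstract describes between the two policies.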
Main Authors: | Vidura Sumanasena, Heshan Fernando, Daswin De Silva, Beniel Thileepan, Amila Pasan, Jayathu Samarawickrama, Evgeny Osipov, Damminda Alahakoon |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-12-01 |
Series: | Sensors |
Subjects: | imitation learning; direct policy learning; autonomous navigation; mobile robots |
Online Access: | https://www.mdpi.com/1424-8220/24/1/185 |
_version_ | 1797358134767910912 |
---|---|
author | Vidura Sumanasena; Heshan Fernando; Daswin De Silva; Beniel Thileepan; Amila Pasan; Jayathu Samarawickrama; Evgeny Osipov; Damminda Alahakoon |
author_facet | Vidura Sumanasena; Heshan Fernando; Daswin De Silva; Beniel Thileepan; Amila Pasan; Jayathu Samarawickrama; Evgeny Osipov; Damminda Alahakoon |
author_sort | Vidura Sumanasena |
collection | DOAJ |
description | Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, the use of DPL in real-world applications has not been sufficiently explored, owing to the inherent challenges of mobilizing direct human expertise and the difficulty of measuring comparative performance. Furthermore, autonomous systems are often resource-constrained, which limits the potential application and implementation of highly effective deep learning models. In this work, we present a lightweight DPL-based approach for training mobile robots in navigational tasks. We integrated a safety policy alongside the navigational policy to safeguard the robot and the environment. The approach was evaluated in simulations and real-world settings and compared with recent work in this space. The results of these experiments, together with the efficient transfer from simulation to real-world settings, demonstrate that our approach improves performance compared to its hardware-intensive counterparts. We show that, using the proposed methodology, the training agent achieves performance close to that of the expert within the first 15 training iterations in both simulation and real-world settings. |
first_indexed | 2024-03-08T14:57:28Z |
format | Article |
id | doaj.art-830b7d31c4ac4399be05fc3442e3df79 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-08T14:57:28Z |
publishDate | 2023-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-830b7d31c4ac4399be05fc3442e3df79; 2024-01-10T15:08:57Z; eng; MDPI AG; Sensors; 1424-8220; 2023-12-01; Vol. 24, Iss. 1, Art. 185; 10.3390/s24010185; Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings; Vidura Sumanasena (Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia); Heshan Fernando (Department of Computer Engineering, Rensselaer Polytechnic Institute, New York, NY 12180, USA); Daswin De Silva (Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia); Beniel Thileepan (Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK); Amila Pasan (Centre for Wireless Communications, University of Oulu, 90570 Oulu, Finland); Jayathu Samarawickrama (Department of Electronic and Telecom Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka); Evgeny Osipov (Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, 97187 Luleå, Sweden); Damminda Alahakoon (Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia); https://www.mdpi.com/1424-8220/24/1/185; imitation learning; direct policy learning; autonomous navigation; mobile robots |
spellingShingle | Vidura Sumanasena; Heshan Fernando; Daswin De Silva; Beniel Thileepan; Amila Pasan; Jayathu Samarawickrama; Evgeny Osipov; Damminda Alahakoon; Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings; Sensors; imitation learning; direct policy learning; autonomous navigation; mobile robots |
title | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings |
title_full | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings |
title_fullStr | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings |
title_full_unstemmed | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings |
title_short | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings |
title_sort | hardware efficient direct policy imitation learning for robotic navigation in resource constrained settings |
topic | imitation learning; direct policy learning; autonomous navigation; mobile robots |
url | https://www.mdpi.com/1424-8220/24/1/185 |