Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings

Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, using DPL in real-world applications is not sufficiently explored due to the inherent challenges of mobilizing direct human expertise and th...

Bibliographic Details
Main Authors:	Vidura Sumanasena, Heshan Fernando, Daswin De Silva, Beniel Thileepan, Amila Pasan, Jayathu Samarawickrama, Evgeny Osipov, Damminda Alahakoon
Format:	Article
Language:	English
Published:	MDPI AG 2023-12-01
Series:	Sensors
Subjects:	imitation learning; direct policy learning; autonomous navigation; mobile robots
Online Access:	https://www.mdpi.com/1424-8220/24/1/185

author	Vidura Sumanasena; Heshan Fernando; Daswin De Silva; Beniel Thileepan; Amila Pasan; Jayathu Samarawickrama; Evgeny Osipov; Damminda Alahakoon
author_sort	Vidura Sumanasena
collection	DOAJ
description	Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, using DPL in real-world applications is not sufficiently explored due to the inherent challenges of mobilizing direct human expertise and the difficulty of measuring comparative performance. Furthermore, autonomous systems are often resource-constrained, thereby limiting the potential application and implementation of highly effective deep learning models. In this work, we present a lightweight DPL-based approach to train mobile robots in navigational tasks. We integrated a safety policy alongside the navigational policy to safeguard the robot and the environment. The approach was evaluated in simulations and real-world settings and compared with recent work in this space. The results of these experiments and the efficient transfer from simulations to real-world settings demonstrate that our approach has improved performance compared to its hardware-intensive counterparts. We show that using the proposed methodology, the training agent achieves closer performance to the expert within the first 15 training iterations in simulation and real-world settings.
first_indexed	2024-03-08T14:57:28Z
format	Article
id	doaj.art-830b7d31c4ac4399be05fc3442e3df79
institution	Directory of Open Access Journals
issn	1424-8220
language	English
last_indexed	2024-03-08T14:57:28Z
publishDate	2023-12-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-830b7d31c4ac4399be05fc3442e3df79 | 2024-01-10T15:08:57Z | eng | MDPI AG | Sensors | 1424-8220 | 2023-12-01 | vol. 24, no. 1, art. 185 | 10.3390/s24010185 | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings | Vidura Sumanasena (Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia); Heshan Fernando (Department of Computer Engineering, Rensselaer Polytechnic Institute, New York, NY 12180, USA); Daswin De Silva (Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia); Beniel Thileepan (Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK); Amila Pasan (Centre for Wireless Communications, University of Oulu, 90570 Oulu, Finland); Jayathu Samarawickrama (Department of Electronic and Telecom Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka); Evgeny Osipov (Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, 97187 Luleå, Sweden); Damminda Alahakoon (Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia) | https://www.mdpi.com/1424-8220/24/1/185 | imitation learning; direct policy learning; autonomous navigation; mobile robots
title	Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings
topic	imitation learning; direct policy learning; autonomous navigation; mobile robots
url	https://www.mdpi.com/1424-8220/24/1/185

Similar Items