EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking

Abstract Autonomous exploration is a critical technology to realize robotic intelligence as it allows unsupervised preparation for future tasks and facilitates flexible deployment. In this paper, a novel Deep Reinforcement Learning (DRL) based autonomous exploration strategy is proposed to efficient...

Bibliographic Details
Main Authors:	Bolei Chen, Ping Zhong, Yongzheng Cui, Siyi Lu, Yixiong Liang, Yu Sheng
Format:	Article
Language:	English
Published:	Springer 2023-06-01
Series:	Complex & Intelligent Systems
Subjects:	Autonomous exploration, Episodic memory, Deep reinforcement learning, Generalized Voronoi diagram, Invalid action masking
Online Access:	https://doi.org/10.1007/s40747-023-01144-x

_version_	1797647097231572992
author	Bolei Chen Ping Zhong Yongzheng Cui Siyi Lu Yixiong Liang Yu Sheng
author_facet	Bolei Chen Ping Zhong Yongzheng Cui Siyi Lu Yixiong Liang Yu Sheng
author_sort	Bolei Chen
collection	DOAJ
description	Autonomous exploration is a critical technology for realizing robotic intelligence, as it allows unsupervised preparation for future tasks and facilitates flexible deployment. In this paper, a novel Deep Reinforcement Learning (DRL) based autonomous exploration strategy is proposed to efficiently reduce the unknown area of the workspace and provide accurate 2D map construction for mobile robots. Unlike existing human-designed exploration techniques, which usually make strong assumptions about the scenarios and the tasks, we use a model-free method to directly learn an exploration strategy through trial-and-error interactions with complex environments. Specifically, the Generalized Voronoi Diagram (GVD) is first used for domain conversion to obtain a high-dimensional Topological Environmental Representation (TER). Then, the Generalized Voronoi Network (GVN), with spatial awareness and episodic memory, is designed to learn autonomous exploration policies interactively online. For complete and efficient exploration, Invalid Action Masking (IAM) is employed to reshape the configuration space of exploration tasks, coping with the explosion of the action space and observation space caused by the expansion of the exploration range. Furthermore, a well-designed reward function is leveraged to guide policy learning. Extensive baseline tests and comparative simulations show that our strategy outperforms state-of-the-art strategies in terms of map quality and exploration speed. Ablation studies and mobile robot experiments demonstrate the effectiveness and superiority of our strategy.
first_indexed	2024-03-11T15:11:26Z
format	Article
id	doaj.art-12ea86a4e8584c4cbdb61c58de857c7a
institution	Directory Open Access Journal
issn	2199-4536 2198-6053
language	English
last_indexed	2024-03-11T15:11:26Z
publishDate	2023-06-01
publisher	Springer
record_format	Article
series	Complex & Intelligent Systems
spelling	doaj.art-12ea86a4e8584c4cbdb61c58de857c7a | 2023-10-29T12:41:40Z | eng | Springer | Complex & Intelligent Systems | 2199-4536 | 2198-6053 | 2023-06-01 | vol. 9, no. 6, pp. 7365-7379 | 10.1007/s40747-023-01144-x | EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking | Bolei Chen, Ping Zhong, Yongzheng Cui, Siyi Lu, Yixiong Liang, Yu Sheng (all authors: School of Computer Science and Engineering, Central South University) | Abstract: Autonomous exploration is a critical technology to realize robotic intelligence as it allows unsupervised preparation for future tasks and facilitates flexible deployment. In this paper, a novel Deep Reinforcement Learning (DRL) based autonomous exploration strategy is proposed to efficiently reduce the unknown area of the workspace and provide accurate 2D map construction for mobile robots. Different from existing human-designed exploration techniques that usually make strong assumptions about the scenarios and the tasks, we utilize a model-free method to directly learn an exploration strategy through trial-and-error interactions with complex environments. To be specific, the Generalized Voronoi Diagram (GVD) is first utilized for domain conversion to obtain a high-dimensional Topological Environmental Representation (TER). Then, the Generalized Voronoi Networks (GVN) with spatial awareness and episodic memory is designed to learn autonomous exploration policies interactively online. For complete and efficient exploration, Invalid Action Masking (IAM) is employed to reshape the configuration space of exploration tasks to cope with the explosion of action space and observation space caused by the expansion of the exploration range. Furthermore, a well-designed reward function is leveraged to guide the learning of policies. Extensive baseline tests and comparative simulations show that our strategy outperforms the state-of-the-art strategies in terms of map quality and exploration speed. Sufficient ablation studies and mobile robot experiments demonstrate the effectiveness and superiority of our strategy. | https://doi.org/10.1007/s40747-023-01144-x | Autonomous exploration, Episodic memory, Deep reinforcement learning, Generalized Voronoi diagram, Invalid action masking
spellingShingle	Bolei Chen Ping Zhong Yongzheng Cui Siyi Lu Yixiong Liang Yu Sheng EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking Complex & Intelligent Systems Autonomous exploration Episodic memory Deep reinforcement learning Generalized Voronoi diagram Invalid action masking
title	EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking
title_full	EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking
title_fullStr	EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking
title_full_unstemmed	EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking
title_short	EMExplorer: an episodic memory enhanced autonomous exploration strategy with Voronoi domain conversion and invalid action masking
title_sort	emexplorer an episodic memory enhanced autonomous exploration strategy with voronoi domain conversion and invalid action masking
topic	Autonomous exploration, Episodic memory, Deep reinforcement learning, Generalized Voronoi diagram, Invalid action masking
url	https://doi.org/10.1007/s40747-023-01144-x
work_keys_str_mv	AT boleichen emexploreranepisodicmemoryenhancedautonomousexplorationstrategywithvoronoidomainconversionandinvalidactionmasking AT pingzhong emexploreranepisodicmemoryenhancedautonomousexplorationstrategywithvoronoidomainconversionandinvalidactionmasking AT yongzhengcui emexploreranepisodicmemoryenhancedautonomousexplorationstrategywithvoronoidomainconversionandinvalidactionmasking AT siyilu emexploreranepisodicmemoryenhancedautonomousexplorationstrategywithvoronoidomainconversionandinvalidactionmasking AT yixiongliang emexploreranepisodicmemoryenhancedautonomousexplorationstrategywithvoronoidomainconversionandinvalidactionmasking AT yusheng emexploreranepisodicmemoryenhancedautonomousexplorationstrategywithvoronoidomainconversionandinvalidactionmasking
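
Note on Invalid Action Masking (IAM): the abstract names IAM as the mechanism for coping with a growing action space, but this record does not describe how the paper decides which actions are invalid. The following is only a minimal sketch of the general technique, assuming a discrete policy that outputs one logit per action. The function name `masked_policy` and the example mask are hypothetical and are not taken from the paper.

```python
import math

def masked_policy(logits, valid_mask):
    """Softmax over logits that gives invalid actions exactly zero probability.

    `valid_mask[i]` is True when action i may be taken. How validity is
    determined is an assumption left to the caller; it is not specified here.
    """
    if len(logits) != len(valid_mask):
        raise ValueError("logits and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, valid_mask) if ok)
    # Invalid actions contribute zero weight, as if their logit were -infinity.
    weights = [math.exp(l - top) if ok else 0.0
               for l, ok in zip(logits, valid_mask)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: four candidate actions, two of them masked out.
probs = masked_policy([1.0, 2.0, 0.5, 3.0], [True, False, True, False])
```

Because masked actions receive zero probability, they are never sampled, and the remaining probabilities are renormalized over the valid actions only.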

Similar Items