Value-based subgoal discovery and path planning for reaching long-horizon goals

Learning to reach long-horizon goals in spatial traversal tasks is a significant challenge for autonomous agents. Recent subgoal graph-based planning methods address this challenge by decomposing a goal into a sequence of shorter-horizon subgoals. These methods, however, use arbitrary heuristics for...

Bibliographic Details
Main Authors:	Pateria, Shubham; Subagdja, Budhitama; Tan, Ah-Hwee; Quek, Chai
Other Authors:	School of Computer Science and Engineering
Format:	Journal Article
Language:	English
Published:	2023
Subjects:	Engineering::Computer science and engineering; Motion Planning; Path Planning
Online Access:	https://hdl.handle.net/10356/170579

Institution:	Nanyang Technological University
Collection:	NTU
Record ID:	ntu-10356/170579
Record Format:	dspace
Description:	Learning to reach long-horizon goals in spatial traversal tasks is a significant challenge for autonomous agents. Recent subgoal graph-based planning methods address this challenge by decomposing a goal into a sequence of shorter-horizon subgoals. These methods, however, use arbitrary heuristics for sampling or discovering subgoals, which may not conform to the cumulative reward distribution. Moreover, they are prone to learning erroneous connections (edges) between subgoals, especially those lying across obstacles. To address these issues, this article proposes a novel subgoal graph-based planning method called learning subgoal graph using value-based subgoal discovery and automatic pruning (LSGVP). The proposed method uses a subgoal discovery heuristic that is based on a cumulative reward (value) measure and yields sparse subgoals, including those lying on the higher cumulative reward paths. Moreover, LSGVP guides the agent to automatically prune the learned subgoal graph to remove the erroneous edges. The combination of these novel features helps the LSGVP agent to achieve higher cumulative positive rewards than other subgoal sampling or discovery heuristics, as well as higher goal-reaching success rates than other state-of-the-art subgoal graph-based planning methods.
Citation:	Pateria, S., Subagdja, B., Tan, A. & Quek, C. (2023). Value-based subgoal discovery and path planning for reaching long-horizon goals. IEEE Transactions on Neural Networks and Learning Systems. https://dx.doi.org/10.1109/TNNLS.2023.3240004
Journal:	IEEE Transactions on Neural Networks and Learning Systems
ISSN:	2162-237X
DOI:	10.1109/TNNLS.2023.3240004
Other Identifiers:	37022814; 2-s2.0-85148420396
Rights:	© 2023 IEEE. All rights reserved.
Record Timestamp:	2023-09-19T08:52:55Z
First Indexed:	2024-10-01T06:18:40Z
Last Indexed:	2024-10-01T06:18:40Z
