Learning to reason over scene graphs: a case study of finetuning GPT-2 into a robot language model for grounded task planning
Long-horizon task planning is essential for the development of intelligent assistive and service robots. In this work, we investigate the applicability of a smaller class of large language models (LLMs), specifically GPT-2, in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially. Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans, thereby learning to reason over long-horizon tasks, as encountered in the ALFRED benchmark. We compare our approach with classical planning and baseline methods to examine the applicability and generalizability of LLM-based planners. Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics.
Main Authors: | Georgia Chalvatzaki, Ali Younes, Daljeet Nandha, An Thai Le, Leonardo F. R. Ribeiro, Iryna Gurevych |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2023-08-01 |
Series: | Frontiers in Robotics and AI |
Subjects: | robot learning; task planning; grounding; language models (LMs); pretrained models; scene graphs |
Online Access: | https://www.frontiersin.org/articles/10.3389/frobt.2023.1221739/full |
author | Georgia Chalvatzaki, Ali Younes, Daljeet Nandha, An Thai Le, Leonardo F. R. Ribeiro, Iryna Gurevych |
collection | DOAJ |
description | Long-horizon task planning is essential for the development of intelligent assistive and service robots. In this work, we investigate the applicability of a smaller class of large language models (LLMs), specifically GPT-2, in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially. Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans, thereby learning to reason over long-horizon tasks, as encountered in the ALFRED benchmark. We compare our approach with classical planning and baseline methods to examine the applicability and generalizability of LLM-based planners. Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics. |
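The description above outlines the core pipeline: serialize the scene graph that represents the domain, condition GPT-2 on it together with the human request, and generate subgoal specifications for a downstream planner. The following is a minimal illustrative sketch of that idea, not the authors' code: the serialization format, prompt layout, and example triples are assumptions, and a pretrained (unfinetuned) GPT-2 checkpoint is used only as a stand-in for the finetuned robot language model.

```python
# Minimal sketch (assumptions, not the authors' implementation): ground a
# GPT-2 prompt on a serialized scene graph and generate subgoal text.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # the paper finetunes such a model on ALFRED-style data

def serialize_scene_graph(triples):
    """Flatten (subject, relation, object) triples into plain text."""
    return " ".join(f"({s} {r} {o})" for s, r, o in triples)

# Hypothetical scene and request, loosely in the style of ALFRED tasks.
scene = [("apple", "on", "countertop"), ("knife", "in", "drawer"), ("fridge", "in", "kitchen")]
request = "Put a slice of apple in the fridge."

prompt = f"Scene: {serialize_scene_graph(scene)}\nTask: {request}\nSubgoals:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=40,
    pad_token_id=tokenizer.eos_token_id,  # avoid the missing-pad-token warning
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the base `gpt2` checkpoint will not emit meaningful subgoals; the point of the sketch is only the grounding mechanism, i.e., that the scene graph enters the model as part of the text prompt and the generated continuation is parsed as a subgoal sequence.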
format | Article |
institution | Directory Open Access Journal |
issn | 2296-9144 |
language | English |
publishDate | 2023-08-01 |
publisher | Frontiers Media S.A. |
series | Frontiers in Robotics and AI |
doi | 10.3389/frobt.2023.1221739 |
affiliation | Georgia Chalvatzaki: Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany; Hessian.AI, Darmstadt, Germany; Center for Mind, Brain and Behavior, University Marburg and JLU Giessen, Marburg, Germany |
affiliation | Ali Younes: Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany |
affiliation | Daljeet Nandha: Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany |
affiliation | An Thai Le: Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany |
affiliation | Leonardo F. R. Ribeiro: Amazon Alexa, Seattle, WA, United States |
affiliation | Iryna Gurevych: Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany; Hessian.AI, Darmstadt, Germany |
title | Learning to reason over scene graphs: a case study of finetuning GPT-2 into a robot language model for grounded task planning |
topic | robot learning; task planning; grounding; language models (LMs); pretrained models; scene graphs |
url | https://www.frontiersin.org/articles/10.3389/frobt.2023.1221739/full |