ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application

This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI’s ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition pro...


Bibliographic Details
Main Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Task planning, robot manipulation, large language models, ChatGPT
Online Access: https://ieeexplore.ieee.org/document/10235949/
_version_ 1797688487622737920
author Naoki Wake
Atsushi Kanehira
Kazuhiro Sasabuchi
Jun Takamatsu
Katsushi Ikeuchi
author_sort Naoki Wake
collection DOAJ
description This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI’s ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs, adapt to various environments, and create multi-step task plans while mitigating the impact of the token limit imposed on ChatGPT. In our approach, ChatGPT receives both instructions and textual environmental data, and outputs a task plan and an updated environment. These environmental data are reused in subsequent task planning, thus eliminating the extensive record-keeping of prior task plans within the prompts of ChatGPT. Experimental results demonstrated the effectiveness of these prompts across various domestic environments, such as manipulations in front of a shelf, a fridge, and a drawer. The conversational capability of ChatGPT allows users to adjust the output via natural-language feedback. Additionally, a quantitative evaluation using VirtualHome showed that our results are comparable to previous studies. Specifically, 36% of task planning met both executability and correctness, and the rate approached 100% after several rounds of feedback. Our experiments revealed that ChatGPT can reasonably plan tasks and estimate post-operation environments without actual experience in object manipulation. Despite the allure of ChatGPT-based task planning in robotics, a standardized methodology remains elusive, making our work a substantial contribution. These prompts can serve as customizable templates, offering practical resources for the robotics research community. Our prompts and source code are open source and publicly available at https://github.com/microsoft/ChatGPT-Robot-Manipulation-Prompts.
first_indexed 2024-03-12T01:32:40Z
format Article
id doaj.art-d310831d5296475aa84f1dabd9cfc6d2
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-12T01:32:40Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-d310831d5296475aa84f1dabd9cfc6d2
2023-09-11T23:02:08Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
Volume 11, pages 95060-95078
DOI: 10.1109/ACCESS.2023.3310935
Document ID: 10235949
ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application
Naoki Wake (https://orcid.org/0000-0001-8278-2373), Applied Robotics Research, Microsoft, Redmond, WA, USA
Atsushi Kanehira, Applied Robotics Research, Microsoft, Redmond, WA, USA
Kazuhiro Sasabuchi (https://orcid.org/0000-0002-5408-3089), Applied Robotics Research, Microsoft, Redmond, WA, USA
Jun Takamatsu (https://orcid.org/0000-0001-7457-2878), Applied Robotics Research, Microsoft, Redmond, WA, USA
Katsushi Ikeuchi (https://orcid.org/0000-0001-9758-9357), Applied Robotics Research, Microsoft, Redmond, WA, USA
https://ieeexplore.ieee.org/document/10235949/
Task planning
robot manipulation
large language models
ChatGPT
title ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application
topic Task planning
robot manipulation
large language models
ChatGPT
url https://ieeexplore.ieee.org/document/10235949/