ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application

This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI’s ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition pro...

Full description

Bibliographic Details
Main Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi
Format: Article
Language:English
Published: IEEE 2023-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10235949/
Description
Summary:This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI&#x2019;s ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs, adapt to various environments, and create multi-step task plans while mitigating the impact of token limit imposed on ChatGPT. In our approach, ChatGPT receives both instructions and textual environmental data, and outputs a task plan and an updated environment. These environmental data are reused in subsequent task planning, thus eliminating the extensive record-keeping of prior task plans within the prompts of ChatGPT. Experimental results demonstrated the effectiveness of these prompts across various domestic environments, such as manipulations in front of a shelf, a fridge, and a drawer. The conversational capability of ChatGPT allows users to adjust the output via natural-language feedback. Additionally, a quantitative evaluation using VirtualHome showed that our results are comparable to previous studies. Specifically, 36&#x0025; of task planning met both executability and correctness, and the rate approached 100&#x0025; after several rounds of feedback. Our experiments revealed that ChatGPT can reasonably plan tasks and estimate post-operation environments without actual experience in object manipulation. Despite the allure of ChatGPT-based task planning in robotics, a standardized methodology remains elusive, making our work a substantial contribution. These prompts can serve as customizable templates, offering practical resources for the robotics research community. Our prompts and source code are open source and publicly available at <uri>https://github.com/microsoft/ChatGPT-Robot-Manipulation-Prompts</uri>.
ISSN:2169-3536