A Framework for LLM-based Lifelong Learning in Robot Manipulation

While robotic agents have become increasingly adept at low-level manipulation skills, increasingly they are being guided by large language model planners that decompose complex tasks into subgoals. Recent works indicate that these language models may also be effective skill learners. We develop HaLP...

Full description

Bibliographic Details
Main Author: Mao, Jerry W.
Other Authors: Agrawal, Pulkit
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access:https://hdl.handle.net/1721.1/153895
Description
Summary:While robotic agents have become increasingly adept at low-level manipulation skills, increasingly they are being guided by large language model planners that decompose complex tasks into subgoals. Recent works indicate that these language models may also be effective skill learners. We develop HaLP 2.0, a modular and extensible framework for lifelong learning in human-assisted language planning, using GPT-4 to propose a curriculum of skills that is learned, used, and intelligently reused. Our system is designed for large-scale experiments, is equipped with a user-friendly interface, and is extensible to new skill learning frameworks. We demonstrate extensibility by comparing alternative implementations of our abstractions and improving overall performance by incorporating novel frameworks. Moreover, we conduct a focused study of GPT-4, using crowd-sourced scene and task datasets, finding that language models are capable agents of skill reuse and adaptation. We observe that while performance is dependent on language context, supplying optimized prompts can yield exceptional skill reuse behaviors. We envision that as manipulation primitives and large language models become more powerful, our system will be ready to synthesize their capabilities to create an autonomous system for lifelong learning, that can one day be deployed in the real world.