Grounding Verbs of Motion in Natural Language Commands to Robots

To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. An important class of instructions involves interacting with people, such as “Follow the person to the kitchen” or “Meet the person at the elevators.” These instructions require that...

Bibliographic Details
Main Authors: Kollar, Thomas Fleming; Tellex, Stefanie A; Roy, Deb K; Roy, Nicholas
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Published: Springer Nature 2018
Online Access: http://hdl.handle.net/1721.1/114657
https://orcid.org/0000-0002-4333-7194
https://orcid.org/0000-0002-8293-0492
author Kollar, Thomas Fleming
Tellex, Stefanie A
Roy, Deb K
Roy, Nicholas
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
collection MIT
description To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. An important class of instructions involves interacting with people, such as “Follow the person to the kitchen” or “Meet the person at the elevators.” These instructions require that the robot fluidly react to changes in the environment, not simply follow a pre-computed plan. We present an algorithm for understanding natural language commands with three components. First, we create a cost function that scores the language according to how well it matches a candidate plan in the environment, defined as the log-likelihood of the plan given the command. Components of the cost function include novel models for the meanings of motion verbs such as “follow,” “meet,” and “avoid,” as well as spatial relations such as “to” and landmark phrases such as “the kitchen.” Second, an inference method uses this cost function to perform forward search, finding a plan that matches the natural language command. Third, a high-level controller repeatedly calls the inference method at each timestep to compute a new plan in response to changes in the environment such as the movement of the human partner or other people in the scene. When a command consists of more than a single task, the controller switches to the next task when an earlier one is satisfied. We evaluate our approach on a set of example tasks that require the ability to follow both simple and complex natural language commands.
Keywords: Cost Function; Spatial Relation; State Sequence; Edit Distance; Statistical Machine Translation
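The replanning structure outlined in the description can be illustrated with a toy sketch. This is not the authors' implementation: the grid world, the `MOVES` set, the `follow_cost` function (a hand-written Manhattan-distance proxy standing in for the learned log-likelihood cost), and the exhaustive fixed-horizon search are all assumptions made for illustration. The sketch shows only the overall shape: a cost function scores candidate plans, forward search picks the cheapest one, and a controller replans at every timestep as the person moves.

```python
from itertools import product

# Hypothetical action set: stay in place or move one cell in a 4-connected grid.
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def step(pos, move):
    """Apply one move to a grid position."""
    return (pos[0] + move[0], pos[1] + move[1])


def manhattan(a, b):
    """Grid distance between two cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def follow_cost(plan_states, person):
    """Toy cost for a "follow" command: summed distance to the person.

    Assumption: this replaces the learned cost described in the abstract.
    """
    return sum(manhattan(s, person) for s in plan_states)


def forward_search(start, person, horizon=3):
    """Enumerate every move sequence of length `horizon`; return the cheapest."""
    best_cost, best_moves = None, None
    for moves in product(MOVES, repeat=horizon):
        states, pos = [], start
        for m in moves:
            pos = step(pos, m)
            states.append(pos)
        cost = follow_cost(states, person)
        if best_cost is None or cost < best_cost:
            best_cost, best_moves = cost, moves
    return best_moves


def controller(robot, person_track, horizon=3):
    """Replan at each timestep and execute only the first move of the plan."""
    trace = [robot]
    for person in person_track:
        plan = forward_search(robot, person, horizon)
        robot = step(robot, plan[0])
        trace.append(robot)
    return trace
```

Calling `controller((0, 0), [(2, 0)] * 4)` returns the sequence of robot positions for a person standing still at `(2, 0)`; passing a changing `person_track` makes the robot replan against the newest observation each step. Switching between tasks in a multi-part command is not covered by this sketch.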
first_indexed 2024-09-23T10:47:31Z
format Article
id mit-1721.1/114657
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T10:47:31Z
publishDate 2018
publisher Springer Nature
record_format dspace
spelling mit-1721.1/114657 2022-09-27T15:02:35Z
Title: Grounding Verbs of Motion in Natural Language Commands to Robots
Main Authors: Kollar, Thomas Fleming; Tellex, Stefanie A; Roy, Deb K; Roy, Nicholas
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Sponsor: United States. Office of Naval Research (Grant MURI N00014-07-1-0749)
Dates: 2018-04-11T15:17:57Z; 2018-04-11T15:17:57Z; 2014; 2018-04-10T14:50:47Z
Type: Article; http://purl.org/eprint/type/ConferencePaper
ISBN: 978-3-642-28571-4; 978-3-642-28572-1
ISSN: 1610-7438; 1610-742X
Handle: http://hdl.handle.net/1721.1/114657
Citation: Kollar, Thomas et al. “Grounding Verbs of Motion in Natural Language Commands to Robots.” edited by O. Khatib, V. Kumar and G. Sukhatme. Experimental Robotics (2014): 31–47 © 2014 Springer-Verlag Berlin Heidelberg
ORCID: https://orcid.org/0000-0002-4333-7194; https://orcid.org/0000-0002-8293-0492
DOI: http://dx.doi.org/10.1007/978-3-642-28572-1_3
Published in: Experimental Robotics
License: Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
File format: application/pdf
Publisher: Springer Nature
Source: Other repository
title Grounding Verbs of Motion in Natural Language Commands to Robots