Learning semantic maps from natural language

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.

Bibliographic Details
Main Author: Hemachandra, Sachithra Madhawa
Other Authors: Nicholas Roy.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2015
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/97757
collection MIT
description Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
first_indexed 2024-09-23T09:01:45Z
format Thesis
id mit-1721.1/97757
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T09:01:45Z
publishDate 2015
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/97757 2019-04-10T20:17:52Z
Learning semantic maps from natural language
Hemachandra, Sachithra Madhawa
Nicholas Roy.
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Electrical Engineering and Computer Science.
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from PDF student-submitted version of thesis.
Includes bibliographical references (pages 185-193).
As robots move into human-occupied environments, the need for effective mechanisms to enable interactions with humans becomes vital. Natural language is a flexible, intuitive medium that can enable such interactions, but language understanding requires robots to learn representations of their environments that are compatible with the conceptual models used by people. Current approaches to constructing such spatial-semantic representations rely solely on traditional sensors to acquire knowledge of the environment, which restricts robots to learning limited knowledge of their local surround. Furthermore, they can only reason over the limited portion of the environment that is in the robot's field-of-view. Natural language, on the other hand, allows people to share rich properties of their environment with their robotic partners in a flexible, efficient manner. The ability to integrate such descriptions can allow the robot to learn semantic properties such as colloquial names that are difficult to infer using existing methods, and learn about the world outside its perception range.
The spatial and temporal disconnect between language descriptions and the robot's onboard sensors makes fusing the two sources of information challenging. This thesis addresses the problem of fusing information contained in natural language descriptions with the robot's onboard sensors to construct spatial-semantic representations useful for interacting with humans. The novelty lies in treating natural language descriptions as another sensor observation that informs the robot about its environment. Towards this end, we introduce the semantic graph, a spatial-semantic representation that provides a common framework in which we integrate information that the user communicates (e.g., labels and spatial relations) with observations from the robot's sensors. Our algorithm efficiently maintains a factored distribution over semantic graphs based upon the stream of natural language and low-level sensor information. We detail the means by which the framework incorporates knowledge conveyed by the user's descriptions, including the ability to reason over expressions that reference yet unknown regions in the environment. We evaluate the algorithm's ability to learn human-centric maps of several different environments and analyze the knowledge inferred from language and the utility of the learned maps. The results demonstrate that the incorporation of information from free-form descriptions increases the metric, topological and semantic accuracy of the recovered environment model.
Next, we outline an algorithm that enables robots to improve their spatial-semantic representation of an environment by engaging users in dialog. The algorithm reasons over the ambiguity of language descriptions provided by the user given the current map, and selects information-gathering actions in the form of targeted questions about its local surroundings and areas distant from the robot.
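The selection of targeted questions described above can be illustrated with a small sketch. This is not the thesis's implementation: it assumes a discrete belief over a handful of map hypotheses, noise-free yes/no answers, and a fixed per-question cost, and every name in it (`entropy`, `expected_information_gain`, `select_question`, the region and question labels) is hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def expected_information_gain(belief, answers):
    """Expected entropy reduction from asking one question.

    belief:  dict mapping hypothesis -> probability (sums to 1).
    answers: dict mapping hypothesis -> the answer the user would give
             if that hypothesis were true (assumed noise-free).
    """
    prior_entropy = entropy(belief.values())

    # Group the probability mass of hypotheses by the answer they predict.
    mass_by_answer = {}
    for hypothesis, p in belief.items():
        mass_by_answer.setdefault(answers[hypothesis], []).append(p)

    # Average the posterior entropy over the possible answers.
    expected_posterior_entropy = 0.0
    for masses in mass_by_answer.values():
        p_answer = sum(masses)
        if p_answer > 0.0:
            posterior = [m / p_answer for m in masses]
            expected_posterior_entropy += p_answer * entropy(posterior)

    return prior_entropy - expected_posterior_entropy


def select_question(belief, questions, costs):
    """Return (question, score) maximizing gain minus cost.

    Returns (None, 0.0) when no question is worth its assumed cost.
    """
    best, best_score = None, 0.0
    for name, answers in questions.items():
        score = expected_information_gain(belief, answers) - costs[name]
        if score > best_score:
            best, best_score = name, score
    return best, best_score


# Hypothetical belief over which region the user called "the kitchen".
belief = {"region_1": 0.5, "region_2": 0.25, "region_3": 0.25}
questions = {
    "is_region_1_the_kitchen": {"region_1": "yes", "region_2": "no", "region_3": "no"},
    "is_region_2_the_kitchen": {"region_1": "no", "region_2": "yes", "region_3": "no"},
}
costs = {"is_region_1_the_kitchen": 0.2, "is_region_2_the_kitchen": 0.2}

best_question, best_score = select_question(belief, questions, costs)
```

In this toy setting the question about `region_1` splits the belief evenly and so carries the most information; raising the assumed cost above the gain makes `select_question` decline to ask anything.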
Our algorithm balances the information-theoretic value of candidate questions with a measure of cost associated with dialog. We demonstrate that by asking deliberate questions of the user, the method significantly improves the accuracy of the learned semantic map.
Finally, we introduce a learning framework that enables robots to successfully follow natural language navigation instructions within previously unknown environments. The algorithm utilizes information about the environment that the human conveys within the command to learn a distribution over the spatial-semantic model of the environment. We achieve this through a formulation of our semantic mapping algorithm that uses information conveyed in the command to directly reason over unobserved spatial structure. The framework then uses this distribution in place of the latent world model to interpret the natural language instruction as a distribution over the intended actions. Next, a belief space planner solves for the action that best satisfies the intent of the command. We apply this towards following directions to objects and natural language route directions in unknown environments. We evaluate this approach through simulation and physical experiments, and demonstrate its ability to follow navigation commands with performance comparable to that of a fully-known environment.
by Sachithra Madhawa Hemachandra.
Ph. D.
2015-07-17T19:12:02Z
2015-07-17T19:12:02Z
2015
2015
Thesis
http://hdl.handle.net/1721.1/97757
912290844
eng
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.
http://dspace.mit.edu/handle/1721.1/7582
193 pages
application/pdf
Massachusetts Institute of Technology
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/97757