Grounding language in events

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.

Bibliographic Details
Main Author:	Fleischman, Michael Ben
Other Authors:	Deb Roy.
Format:	Thesis
Language:	eng
Published:	Massachusetts Institute of Technology 2009
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/46548

_version_	1826196897526710272
author	Fleischman, Michael Ben
author2	Deb Roy.
author_facet	Deb Roy. Fleischman, Michael Ben
author_sort	Fleischman, Michael Ben
collection	MIT
description	Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.
first_indexed	2024-09-23T10:39:41Z
format	Thesis
id	mit-1721.1/46548
institution	Massachusetts Institute of Technology
language	eng
last_indexed	2024-09-23T10:39:41Z
publishDate	2009
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/46548 2019-04-10T15:17:21Z Grounding language in events Fleischman, Michael Ben Deb Roy. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 137-142). Broadcast video and virtual environments are just two of the growing number of domains in which language is embedded in multiple modalities of rich non-linguistic information. Applications for such multimodal domains are often based on traditional natural language processing techniques that ignore the connection between words and the non-linguistic context in which they are used. This thesis describes a methodology for representing these connections in models which ground the meaning of words in representations of events. Incorporating these grounded language models with text-based techniques significantly improves the performance of three multimodal applications: natural language understanding in videogames, sports video search and automatic speech recognition. Two approaches to representing the structure of events are presented and used to model the meaning of words. In the domain of virtual game worlds, a hand-designed hierarchical behavior grammar is used to explicitly represent all the various actions that an agent can take in a virtual world. This grammar is used to interpret events by parsing sequences of observed actions in order to generate hierarchical event structures. In the noisier and more open-ended domain of broadcast sports video, hierarchical temporal patterns are automatically mined from large corpora of unlabeled video data. The structure of events in video is represented by vectors of these hierarchical patterns. Grounded language models are encoded using Hierarchical Bayesian models to represent the probability of words given elements of these event structures. These grounded language models are used to incorporate non-linguistic information into text-based approaches to multimodal applications. In the virtual game domain, this non-linguistic information improves natural language understanding for a virtual agent by nearly 10% and cuts in half the negative effects of noise caused by automatic speech recognition. For broadcast video of baseball and American football, video search systems that incorporate grounded language models are shown to perform up to 33% better than text-based systems. Further, systems for recognizing speech in baseball video that use grounded language models show 25% greater word accuracy than traditional systems. by Michael Ben Fleischman. Ph.D. 2009-08-26T16:48:27Z 2009-08-26T16:48:27Z 2008 2008 Thesis http://hdl.handle.net/1721.1/46548 418279066 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 142 p. application/pdf Massachusetts Institute of Technology
spellingShingle	Electrical Engineering and Computer Science. Fleischman, Michael Ben Grounding language in events
title	Grounding language in events
title_full	Grounding language in events
title_fullStr	Grounding language in events
title_full_unstemmed	Grounding language in events
title_short	Grounding language in events
title_sort	grounding language in events
topic	Electrical Engineering and Computer Science.
url	http://hdl.handle.net/1721.1/46548
work_keys_str_mv	AT fleischmanmichaelben groundinglanguageinevents
