Grounding language models in spatiotemporal context

Natural language is rich and varied, but also highly structured. The rules of grammar are a primary source of linguistic regularity, but there are many other factors that govern patterns of language use. Language models attempt to capture linguistic regularities, typically by modeling the statistics...


Bibliographic Details
Main Authors: Roy, Brandon C., Vosoughi, Soroush, Roy, Deb K
Other Authors: Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Format: Article
Language: en_US
Published: International Speech Communication Association 2014
Online Access: http://hdl.handle.net/1721.1/91490
https://orcid.org/0000-0002-2564-8909
https://orcid.org/0000-0002-4333-7194
description Natural language is rich and varied, but also highly structured. The rules of grammar are a primary source of linguistic regularity, but there are many other factors that govern patterns of language use. Language models attempt to capture linguistic regularities, typically by modeling the statistics of word use, thereby folding in some aspects of grammar and style. Spoken language is an important and interesting subset of natural language that is temporally and spatially grounded. While time and space may directly contribute to a speaker’s choice of words, they may also serve as indicators for communicative intent or other contextual and situational factors. To investigate the value of spatial and temporal information, we build a series of language models using a large, naturalistic corpus of spatially and temporally coded speech collected from a home environment. We incorporate this extralinguistic information by building spatiotemporal word classifiers that are mixed with traditional unigram and bigram models. Our evaluation shows that both perplexity and word error rate can be significantly improved by incorporating this information in a simple framework. The underlying principles of this work could be applied in a wide range of scenarios in which temporal or spatial information is available.
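The mixing step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a closed vocabulary, replaces the spatiotemporal word classifier with a simple add-alpha count model conditioned on a discrete (room, time-of-day) context, and uses simple linear interpolation with a fixed weight `lam`. The toy data, the function names, and the value of `lam` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_unigram(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a closed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def train_context_model(pairs, vocab, alpha=1.0):
    """Add-alpha smoothed P(word | context) from (context, word) pairs.

    Stand-in for a spatiotemporal word classifier (an assumption of this sketch).
    """
    by_ctx = defaultdict(list)
    for ctx, word in pairs:
        by_ctx[ctx].append(word)
    return {ctx: train_unigram(words, vocab, alpha) for ctx, words in by_ctx.items()}


def mixed_prob(word, ctx, unigram, ctx_model, lam):
    """Linear interpolation of the context model and the unigram model.

    Falls back to the unigram distribution for a context never seen in training.
    """
    p_ctx = ctx_model.get(ctx, unigram)[word]
    return lam * p_ctx + (1.0 - lam) * unigram[word]


def perplexity(pairs, prob_fn):
    """Perplexity of a model over (context, word) test pairs."""
    log_sum = sum(math.log2(prob_fn(word, ctx)) for ctx, word in pairs)
    return 2 ** (-log_sum / len(pairs))


# Hypothetical toy corpus: each word is tagged with (room, time of day).
train = [
    (("kitchen", "morning"), "coffee"),
    (("kitchen", "morning"), "milk"),
    (("kitchen", "morning"), "coffee"),
    (("bedroom", "evening"), "book"),
    (("bedroom", "evening"), "sleep"),
    (("bedroom", "evening"), "book"),
]
test = [
    (("kitchen", "morning"), "coffee"),
    (("bedroom", "evening"), "book"),
]

vocab = sorted({word for _, word in train})
unigram = train_unigram([word for _, word in train], vocab)
ctx_model = train_context_model(train, vocab)

# Baseline ignores context; the mixed model interpolates with weight lam.
ppl_base = perplexity(test, lambda w, c: unigram[w])
ppl_mixed = perplexity(test, lambda w, c: mixed_prob(w, c, unigram, ctx_model, lam=0.5))

print(f"unigram perplexity: {ppl_base:.3f}")
print(f"mixed perplexity:   {ppl_mixed:.3f}")
```

On this toy data the context-aware mixture assigns higher probability to words that are typical of their location and time, so its perplexity is lower than the unigram baseline. In practice the interpolation weight would be tuned on held-out data rather than fixed.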
Citation: Roy, Brandon C., Soroush Vosoughi, and Deb Roy. "Grounding language in spatiotemporal context." The 15th Annual Conference of the International Speech Communication Association, September 14-18, 2014.
Date: 2014-09
Record date: 2014-11-07T15:00:20Z
Type: Article, http://purl.org/eprint/type/ConferencePaper
Published in: Proceedings of the 15th Annual Conference of the International Speech Communication Association
Conference program: http://www.interspeech2014.org/public.php?page=program_details.html
License: Creative Commons Attribution-Noncommercial-Share Alike, http://creativecommons.org/licenses/by-nc-sa/4.0/
File format: application/pdf