A hierarchical nonparametric Bayesian approach to statistical language model domain adaptation

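In this paper we present a doubly hierarchical Pitman-Yor process language model. Its bottom layer of hierarchy consists of multiple hierarchical Pitman-Yor process language models, one each for some number of domains. The novel top layer of hierarchy consists of a mechanism to couple together multiple language models such that they share statistical strength. Intuitively this sharing results in the "adaptation" of a latent shared language model to each domain. We introduce a general formalism capable of describing the overall model, which we call the graphical Pitman-Yor process, and explain how to perform Bayesian inference in it. We present encouraging language model domain adaptation results that both illustrate the potential benefits of our new model and suggest new avenues of inquiry. © 2009 by the authors.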

Bibliographic Details
Main Authors: Wood, F, Teh, Y
Format: Journal article
Language: English
Published: 2009
author Wood, F
Teh, Y
collection OXFORD
description In this paper we present a doubly hierarchical Pitman-Yor process language model. Its bottom layer of hierarchy consists of multiple hierarchical Pitman-Yor process language models, one each for some number of domains. The novel top layer of hierarchy consists of a mechanism to couple together multiple language models such that they share statistical strength. Intuitively this sharing results in the "adaptation" of a latent shared language model to each domain. We introduce a general formalism capable of describing the overall model, which we call the graphical Pitman-Yor process, and explain how to perform Bayesian inference in it. We present encouraging language model domain adaptation results that both illustrate the potential benefits of our new model and suggest new avenues of inquiry. © 2009 by the authors.
first_indexed 2024-03-06T22:21:32Z
format Journal article
id oxford-uuid:55389ff4-8b5b-408c-a4bf-267319c665c7
institution University of Oxford
language English
last_indexed 2024-03-06T22:21:32Z
publishDate 2009
record_format dspace
title A hierarchical nonparametric Bayesian approach to statistical language model domain adaptation