Collapsed variational inference for computational linguistics

Full description

Bayesian modelling is a natural fit for tasks in computational linguistics, since it can provide interpretable structures, useful prior controls, and coherent management of uncertainty. However, exact Bayesian inference is intractable for many models of practical interest. Developing approximate Bayesian inference algorithms that are both accurate and efficient remains a fundamental challenge, especially in computational linguistics, where datasets are large and growing and models involve complex latent structure.

Collapsed variational inference (CVI) is an important milestone that combines the efficiency of variational inference (VI) with the accuracy of Markov chain Monte Carlo (MCMC) (Teh et al., 2006). However, its previous applications were limited to bag-of-words models whose hidden variables are conditionally independent given the parameters, whereas in computational linguistics the dependencies among hidden variables are crucial for modelling the underlying syntactic and semantic relations. To enlarge the application domain of CVI, and to address the inference challenge above, we investigate applications of collapsed variational inference in computational linguistics.
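To make the bag-of-words setting concrete, here is a minimal sketch of a zero-order collapsed variational update for LDA (the simplification commonly called CVB0), the kind of model targeted by Teh et al. (2006). It is an illustration only, not code from the thesis; the hyperparameters `alpha` and `beta` and all variable names are assumptions.

```python
import numpy as np

def cvb0_sweep(docs, gamma, alpha, beta):
    """One zero-order collapsed variational (CVB0) sweep for LDA.

    docs  : list of lists of word ids
    gamma : gamma[d] is an (N_d, K) array; row i is the variational
            topic distribution q(z_{d,i}) of token i in document d
    """
    K = gamma[0].shape[1]
    V = 1 + max(w for doc in docs for w in doc)
    # Expected collapsed counts under the current q (the parameters are
    # integrated out; only these count statistics remain).
    n_dk = np.stack([g.sum(axis=0) for g in gamma])   # doc-topic, (D, K)
    n_kw = np.zeros((K, V))                           # topic-word
    for doc, g in zip(docs, gamma):
        np.add.at(n_kw.T, doc, g)                     # scatter into (V, K) view
    n_k = n_kw.sum(axis=1)                            # topic totals, (K,)
    for d, (doc, g) in enumerate(zip(docs, gamma)):
        for i, w in enumerate(doc):
            # Leave-one-out: remove this token's own contribution.
            n_dk[d] -= g[i]; n_kw[:, w] -= g[i]; n_k -= g[i]
            # Update proportional to the collapsed predictive probability.
            q = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
            g[i] = q / q.sum()
            # Restore the counts with the refreshed contribution.
            n_dk[d] += g[i]; n_kw[:, w] += g[i]; n_k += g[i]
    return gamma
```

Because each token's update conditions only on aggregate counts, tokens can be visited in any order; this is precisely the conditional independence that disappears once the hidden variables form a chain or a tree.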
In this thesis, our contributions are threefold. First, we solve a number of inference challenges arising from the hidden variable dependencies and derive a set of new CVI algorithms for two ubiquitous and foundational models in computational linguistics, namely hidden Markov models (HMMs) and probabilistic context-free grammars (PCFGs). We also propose CVI for hierarchical Dirichlet process (HDP) HMMs, the Bayesian nonparametric extensions of HMMs.
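In an HMM the hidden states form a chain, so per-token independent updates like the sketch above no longer apply: the variational marginals must be propagated along the whole sequence. The sketch below shows the generic shape such a step takes — dynamic programming (forward-backward) under mean parameters implied by the current expected counts and Dirichlet priors. It illustrates the structural issue only and is not the CVI algorithm derived in the thesis; `a0`, `b0`, `pi` and the helper names are assumptions.

```python
import numpy as np

def state_marginals(pi, A, B, obs):
    """Posterior state marginals q(z_t) for one observation sequence,
    via the standard scaled forward-backward recursions."""
    T, K = len(obs), len(pi)
    fwd = np.zeros((T, K))
    bwd = np.ones((T, K))
    fwd[0] = pi * B[:, obs[0]]
    fwd[0] /= fwd[0].sum()
    for t in range(1, T):
        fwd[t] = (fwd[t - 1] @ A) * B[:, obs[t]]
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):
        bwd[t] = A @ (B[:, obs[t + 1]] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    g = fwd * bwd
    return g / g.sum(axis=1, keepdims=True)

def collapsed_style_step(seqs, E_trans, E_emit, a0, b0, pi):
    """One collapsed-style refresh: form mean transition/emission
    parameters from expected counts plus Dirichlet pseudo-counts, run
    forward-backward, and recompute the expected emission counts.
    (A full derivation also needs the pairwise marginals q(z_t, z_{t+1})
    to refresh the transition counts; omitted here for brevity.)"""
    A = E_trans + a0
    A = A / A.sum(axis=1, keepdims=True)
    B = E_emit + b0
    B = B / B.sum(axis=1, keepdims=True)
    new_emit = np.zeros_like(E_emit)
    for obs in seqs:
        g = state_marginals(pi, A, B, obs)   # (T, K) marginals
        np.add.at(new_emit.T, obs, g)        # scatter into (V, K) view
    return new_emit
```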
Second, along the way we propose a set of novel algorithmic techniques that are generally applicable to a wide variety of probabilistic graphical models in the conjugate exponential family, as well as to computational linguistic models using non-conjugate HDP constructions. Our work therefore represents one step towards bridging the gap between the increasingly rich Bayesian models used in computational linguistics and recent advances in approximate Bayesian inference.

Third, we empirically evaluate our proposed CVI algorithms and their stochastic versions on a range of computational linguistic tasks, including part-of-speech induction and grammar induction. Experimental results consistently demonstrate that, using our techniques for handling the hidden variable dependencies, the empirical advantages of both VI and MCMC can be combined in a much larger domain of CVI applications.
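The stochastic versions mentioned above follow the usual stochastic variational recipe: compute expected-count statistics on a minibatch, rescale them to corpus size, and blend them into the global counts with a Robbins-Monro step size. A generic sketch, with the schedule constants and the `expected_counts` helper purely illustrative:

```python
import numpy as np

def stochastic_count_update(E_counts, batch_stats, n_total, n_batch, step):
    """Blend rescaled minibatch statistics into the global expected
    counts: an unbiased corpus-level estimate, then a convex
    combination with step size `step`."""
    estimate = (n_total / n_batch) * batch_stats
    return (1.0 - step) * E_counts + step * estimate

# Illustrative driver loop (expected_counts is a hypothetical helper
# running the local inference step of the previous sketches):
# for t, batch in enumerate(batches, start=1):
#     stats = expected_counts(batch)
#     step = (t + 1.0) ** -0.7   # satisfies the Robbins-Monro conditions
#     E_counts = stochastic_count_update(E_counts, stats, N, len(batch), step)
```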
Bibliographic details

Main author: Wang, P
Other authors: Blunsom, P
Format: Thesis
Published: 2016
Institution: University of Oxford