Collapsed variational inference for computational linguistics

Full description

Bayesian modelling is a natural fit for tasks in computational linguistics, since it can provide interpretable structures, useful prior controls, and coherent management of uncertainty. However, exact Bayesian inference is intractable for many models of practical interest. Developing approximate Bayesian inference algorithms that are both accurate and efficient remains a fundamental challenge, especially in computational linguistics, where datasets are large and growing and models involve complex latent structure.

Collapsed variational inference (CVI) is an important milestone that combines the efficiency of variational inference (VI) with the accuracy of Markov chain Monte Carlo (MCMC) (Teh et al., 2006). However, its previous applications were limited to bag-of-words models whose hidden variables are conditionally independent given the parameters, whereas in computational linguistics the dependencies among hidden variables are crucial for modelling the underlying syntactic and semantic relations. To enlarge the application domain of CVI, and to address the inference challenge above, we investigate applications of collapsed variational inference in computational linguistics.
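To make the bag-of-words setting concrete, here is a minimal sketch of a zero-order collapsed variational update for LDA (the simplification commonly called CVB0), the kind of model targeted by Teh et al. (2006). It is an illustration only, not code from the thesis; the hyperparameters `alpha` and `beta` and all variable names are assumptions.

```python
import numpy as np

def cvb0_sweep(docs, gamma, alpha, beta):
    """One zero-order collapsed variational (CVB0) sweep for LDA.

    docs  : list of lists of word ids
    gamma : gamma[d] is an (N_d, K) array; row i is the variational
            topic distribution q(z_{d,i}) of token i in document d
    """
    K = gamma[0].shape[1]
    V = 1 + max(w for doc in docs for w in doc)
    # Expected collapsed counts under the current q (the parameters are
    # integrated out; only these count statistics remain).
    n_dk = np.stack([g.sum(axis=0) for g in gamma])   # doc-topic, (D, K)
    n_kw = np.zeros((K, V))                           # topic-word
    for doc, g in zip(docs, gamma):
        np.add.at(n_kw.T, doc, g)                     # scatter into (V, K) view
    n_k = n_kw.sum(axis=1)                            # topic totals, (K,)
    for d, (doc, g) in enumerate(zip(docs, gamma)):
        for i, w in enumerate(doc):
            # Leave-one-out: remove this token's own contribution.
            n_dk[d] -= g[i]; n_kw[:, w] -= g[i]; n_k -= g[i]
            # Update proportional to the collapsed predictive probability.
            q = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
            g[i] = q / q.sum()
            # Restore the counts with the refreshed contribution.
            n_dk[d] += g[i]; n_kw[:, w] += g[i]; n_k += g[i]
    return gamma
```

Because each token's update conditions only on aggregate counts, tokens can be visited in any order; this is precisely the conditional independence that disappears once the hidden variables form a chain or a tree.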
In this thesis, our contributions are threefold. First, we solve a number of inference challenges arising from the hidden variable dependencies and derive a set of new CVI algorithms for two ubiquitous and foundational models in computational linguistics, namely hidden Markov models (HMMs) and probabilistic context-free grammars (PCFGs). We also propose CVI for hierarchical Dirichlet process (HDP) HMMs, the Bayesian nonparametric extensions of HMMs.
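In an HMM the hidden states form a chain, so per-token independent updates like the sketch above no longer apply: the variational marginals must be propagated along the whole sequence. The sketch below shows the generic shape such a step takes — dynamic programming (forward-backward) under mean parameters implied by the current expected counts and Dirichlet priors. It illustrates the structural issue only and is not the CVI algorithm derived in the thesis; `a0`, `b0`, `pi` and the helper names are assumptions.

```python
import numpy as np

def state_marginals(pi, A, B, obs):
    """Posterior state marginals q(z_t) for one observation sequence,
    via the standard scaled forward-backward recursions."""
    T, K = len(obs), len(pi)
    fwd = np.zeros((T, K))
    bwd = np.ones((T, K))
    fwd[0] = pi * B[:, obs[0]]
    fwd[0] /= fwd[0].sum()
    for t in range(1, T):
        fwd[t] = (fwd[t - 1] @ A) * B[:, obs[t]]
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):
        bwd[t] = A @ (B[:, obs[t + 1]] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    g = fwd * bwd
    return g / g.sum(axis=1, keepdims=True)

def collapsed_style_step(seqs, E_trans, E_emit, a0, b0, pi):
    """One collapsed-style refresh: form mean transition/emission
    parameters from expected counts plus Dirichlet pseudo-counts, run
    forward-backward, and recompute the expected emission counts.
    (A full derivation also needs the pairwise marginals q(z_t, z_{t+1})
    to refresh the transition counts; omitted here for brevity.)"""
    A = E_trans + a0
    A = A / A.sum(axis=1, keepdims=True)
    B = E_emit + b0
    B = B / B.sum(axis=1, keepdims=True)
    new_emit = np.zeros_like(E_emit)
    for obs in seqs:
        g = state_marginals(pi, A, B, obs)   # (T, K) marginals
        np.add.at(new_emit.T, obs, g)        # scatter into (V, K) view
    return new_emit
```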
Second, along the way we propose a set of novel algorithmic techniques that are generally applicable to a wide variety of probabilistic graphical models in the conjugate exponential family, as well as to computational linguistic models using non-conjugate HDP constructions. Our work therefore represents one step towards bridging the gap between the increasingly rich Bayesian models used in computational linguistics and recent advances in approximate Bayesian inference.

Third, we empirically evaluate our proposed CVI algorithms and their stochastic versions on a range of computational linguistic tasks, including part-of-speech induction and grammar induction. Experimental results consistently demonstrate that, using our techniques for handling the hidden variable dependencies, the empirical advantages of both VI and MCMC can be combined in a much larger domain of CVI applications.
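The stochastic versions mentioned above follow the usual stochastic variational recipe: compute expected-count statistics on a minibatch, rescale them to corpus size, and blend them into the global counts with a Robbins-Monro step size. A generic sketch, with the schedule constants and the `expected_counts` helper purely illustrative:

```python
import numpy as np

def stochastic_count_update(E_counts, batch_stats, n_total, n_batch, step):
    """Blend rescaled minibatch statistics into the global expected
    counts: an unbiased corpus-level estimate, then a convex
    combination with step size `step`."""
    estimate = (n_total / n_batch) * batch_stats
    return (1.0 - step) * E_counts + step * estimate

# Illustrative driver loop (expected_counts is a hypothetical helper
# running the local inference step of the previous sketches):
# for t, batch in enumerate(batches, start=1):
#     stats = expected_counts(batch)
#     step = (t + 1.0) ** -0.7   # satisfies the Robbins-Monro conditions
#     E_counts = stochastic_count_update(E_counts, stats, N, len(batch), step)
```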
Bibliographic details

Main author: Wang, P
Other authors: Blunsom, P
Format: Thesis
Published: 2016
Institution: University of Oxford