Fast and correct variational inference for probabilistic programming: Differentiability, reparameterisation and smoothing
author | Wagner, D |
author2 | Ong, L |
collection | OXFORD |
description | <p>Probabilistic programming is an innovative programming paradigm for posing and automatically solving Bayesian inference problems. In this thesis, we study the foundations of fast yet correct inference for probabilistic programming.</p>
<p>Many of the most successful inference techniques (e.g. Hamiltonian Monte Carlo or Stochastic Variational Inference) harness gradients of the so-called density function, which therefore needs to be differentiable at least almost everywhere. We resolve a question posed by Hongseok Yang by demonstrating the following: densities of almost surely terminating programs are differentiable almost everywhere.</p>
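<p>To make the phenomenon concrete, here is a minimal NumPy/SciPy sketch (the model, observation and names are illustrative, not taken from the thesis) of a branching program whose density fails to be differentiable only on a measure-zero set:</p>

```python
import numpy as np
from scipy.stats import norm

# Illustrative model (not from the thesis): a branch on a latent sample
# makes the joint density non-differentiable on the measure-zero set {z = 0},
# yet differentiable almost everywhere.
#
#   z ~ Normal(0, 1)
#   y ~ Normal(1, 1)   if z > 0
#   y ~ Normal(-1, 1)  otherwise
def joint_density(z, y_obs):
    prior = norm.pdf(z, loc=0.0, scale=1.0)
    if z > 0:
        likelihood = norm.pdf(y_obs, loc=1.0, scale=1.0)
    else:
        likelihood = norm.pdf(y_obs, loc=-1.0, scale=1.0)
    return prior * likelihood

# The density jumps at z = 0 (so no derivative exists there), but {0} has
# Lebesgue measure zero, so the density is differentiable almost everywhere.
print(joint_density(-1e-9, y_obs=0.5), joint_density(1e-9, y_obs=0.5))
```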
<p>Having established this property, which is necessary for the correctness of gradient-based inference algorithms, we investigate variational inference, which frames posterior inference as an optimisation problem, in more detail. In practice, the dominant approach to the resulting stochastic optimisation problem is stochastic gradient descent. In particular, a variant using the so-called reparameterisation gradient estimator exhibits low variance, resulting in fast convergence in a traditional statistics setting. Unfortunately, discontinuities, although confined to a set of measure zero, can compromise the correctness of this approach.</p>
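<p>As a rough illustration of the estimator in question, the following sketch applies the reparameterisation z = mu + sigma * eps to a Gaussian variational family; the objective f is a placeholder standing in for an ELBO term, not the thesis's objective:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Objective under the expectation E_{z ~ N(mu, sigma^2)}[f(z)];
# f is an illustrative placeholder, not the thesis's ELBO.
f = lambda z: (z - 2.0) ** 2
df = lambda z: 2.0 * (z - 2.0)          # derivative of f

def reparam_grad(mu, sigma, n=1000):
    """Estimate d/dmu and d/dsigma of E[f(z)] by sampling eps ~ N(0, 1)
    and differentiating through z = mu + sigma * eps (the reparameterisation)."""
    eps = rng.standard_normal(n)
    z = mu + sigma * eps
    # chain rule: dz/dmu = 1, dz/dsigma = eps
    return np.mean(df(z)), np.mean(df(z) * eps)

print(reparam_grad(mu=0.0, sigma=1.0))
```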
<p>Therefore, we propose a smoothed interpretation parameterised by an accuracy coefficient and present type systems establishing the technical preconditions. This allows us to prove that stochastic gradient descent with the reparameterisation gradient estimator is correct when applied to the smoothed problem. Moreover, via a uniform convergence result, the original problem can be solved up to any error tolerance by choosing the accuracy coefficient suitably.</p>
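<p>The flavour of such a smoothing can be sketched as follows: a hard branch indicator is replaced by a sigmoid whose sharpness is governed by an accuracy coefficient; the particular smoothing function and the name eta are assumptions made for illustration only:</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hard branch: discontinuous at z = 0, so objectives containing it can
# bias the reparameterisation gradient estimator.
def hard_branch(z):
    return 1.0 if z > 0 else 0.0

# Smoothed interpretation: differentiable everywhere. eta is the accuracy
# coefficient; as eta -> 0 the smoothing tends pointwise to the hard branch.
def smoothed_branch(z, eta):
    return sigmoid(z / eta)

for eta in (1.0, 0.1, 0.01):
    print(eta, smoothed_branch(0.2, eta))
```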
<p>Furthermore, rather than fixing an accuracy coefficient in advance (limiting the quality of the final solution), we propose a novel variant of stochastic gradient descent, Diagonalisation Stochastic Gradient Descent, which progressively enhances the accuracy of the smoothed approximation during optimisation, and we prove convergence to stationary points of the unsmoothed (original) objective.</p>
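<p>Schematically (with illustrative schedules and a toy gradient oracle, not the thesis's), the diagonalisation idea interleaves tightening of the accuracy coefficient with the usual stochastic-gradient updates:</p>

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_smoothed(theta, eta):
    """Placeholder stochastic gradient of the eta-smoothed objective
    (here: a toy quadratic whose gradient noise shrinks with eta)."""
    return 2.0 * (theta - 2.0) + eta * rng.standard_normal()

def diagonalisation_sgd(theta, steps=10_000):
    for k in range(1, steps + 1):
        eta = 1.0 / np.sqrt(k)    # accuracy coefficient tightened over time (illustrative schedule)
        gamma = 0.1 / k           # diminishing step size (illustrative schedule)
        theta -= gamma * grad_smoothed(theta, eta)
    return theta

print(diagonalisation_sgd(theta=0.0))   # converges towards the minimiser of the toy objective
```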
<p>An empirical evaluation reveals the benefits of our approaches over the state of the art: they are simple, fast, and attain orders-of-magnitude reductions in work-normalised variance. Moreover, Diagonalisation Stochastic Gradient Descent is more stable than standard stochastic gradient descent applied to a fixed-accuracy smoothing.</p>
<p>Finally, we show that the reparameterisation gradient estimator is unbiased for continuous but non-differentiable models, and we propose a method based on higher-order logic to establish continuity in the presence of conditionals. We provide a sound and complete reduction of the continuity-verification problem for models to a satisfiability problem, and we propose novel, efficient randomised decision procedures.</p> |
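<p>The underlying intuition can be illustrated with a toy randomised spot-check: a conditional <code>if g(x) >= 0 then f1(x) else f2(x)</code> is continuous only if the branches agree wherever the guard vanishes. The sketch below merely conveys the flavour of such randomised procedures; it is not the thesis's actual reduction or decision procedure:</p>

```python
import numpy as np

rng = np.random.default_rng(2)

# Conditional model fragment: if g(x) >= 0 then f1(x) else f2(x).
g  = lambda x: x[0] - x[1]      # guard (illustrative)
f1 = lambda x: x[0] ** 2        # then-branch
f2 = lambda x: x[0] * x[1]      # else-branch

def branches_agree_on_boundary(trials=1000, tol=1e-9):
    """Randomised spot-check: sample points on the guard's zero set {g = 0}
    and test whether the two branches coincide there (a necessary condition
    for continuity of the conditional)."""
    for _ in range(trials):
        t = rng.normal()
        x = np.array([t, t])    # parametrises the boundary {x0 - x1 = 0}
        if abs(f1(x) - f2(x)) > tol:
            return False
    return True

print(branches_agree_on_boundary())   # True: x0**2 == x0*x1 whenever x0 == x1
```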
format | Thesis |
id | oxford-uuid:76fbdc5e-37af-4254-a41b-25c67f66fdb0 |
institution | University of Oxford |
language | English |
publishDate | 2023 |
title | Fast and correct variational inference for probabilistic programming: Differentiability, reparameterisation and smoothing |
topic | Variational Inference; Probabilistic Programming |