Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference.

An important feature of Bayesian statistics is the opportunity to do sequential inference: the posterior distribution obtained after seeing a dataset can be used as prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, mixtures of factor analyzers, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensionality, Gaussian processes are more accurate, whereas in higher dimensionality Gaussian mixtures, mixtures of factor analyzers or vine copulas perform better. In our test cases of sequential inference, using posterior approximations gives more accurate results than direct sample reweighting, but joint inference is still preferable over sequential inference whenever possible. Since the performance is case-specific, we provide an R package mvdens with a unified interface for the density approximation methods.

Bibliographic Details
Main Authors: Bram Thijssen, Lodewyk F A Wessels
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0230101
author Bram Thijssen
Lodewyk F A Wessels
collection DOAJ
description An important feature of Bayesian statistics is the opportunity to do sequential inference: the posterior distribution obtained after seeing a dataset can be used as prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, mixtures of factor analyzers, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensionality, Gaussian processes are more accurate, whereas in higher dimensionality Gaussian mixtures, mixtures of factor analyzers or vine copulas perform better. In our test cases of sequential inference, using posterior approximations gives more accurate results than direct sample reweighting, but joint inference is still preferable over sequential inference whenever possible. Since the performance is case-specific, we provide an R package mvdens with a unified interface for the density approximation methods.
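The sequential procedure the abstract describes can be sketched in a small toy example. This is not the paper's mvdens package (which is in R); it is a minimal Python illustration, assuming a conjugate normal model and using scipy's `gaussian_kde` as the density approximation: sample the posterior from the first dataset, fit a KDE to those samples, and use the KDE as the prior when inferring from the second dataset, then compare against the analytic joint posterior.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(0)

# Toy model: y ~ N(theta, 1), prior theta ~ N(0, 10^2), true theta = 2.
theta_true = 2.0
y1 = rng.normal(theta_true, 1.0, size=50)  # dataset 1
y2 = rng.normal(theta_true, 1.0, size=50)  # dataset 2

def metropolis(logpost, n, start=0.0, step=0.5):
    """Plain random-walk Metropolis sampler returning n draws."""
    samples = np.empty(n)
    x, lp = start, logpost(start)
    for i in range(n):
        prop = x + rng.normal(0.0, step)
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

# Step 1: Monte Carlo samples of the posterior given dataset 1.
logpost1 = lambda t: norm.logpdf(t, 0, 10) + norm.logpdf(y1, t, 1).sum()
s1 = metropolis(logpost1, 5000)[1000:]  # drop burn-in

# Step 2: approximate that posterior with a KDE and use the KDE
# as the prior for the inference on dataset 2 (sequential inference).
kde = gaussian_kde(s1)
logpost2 = lambda t: kde.logpdf(t)[0] + norm.logpdf(y2, t, 1).sum()
s2 = metropolis(logpost2, 5000, start=s1.mean())[1000:]

# Reference: analytic joint posterior mean from both datasets at once,
# available here because the normal model is conjugate.
y = np.concatenate([y1, y2])
post_var = 1.0 / (1.0 / 10**2 + len(y) / 1.0)
post_mean = post_var * y.sum()
print(f"sequential (KDE prior): {s2.mean():.3f}")
print(f"joint (analytic):       {post_mean:.3f}")
```

In this one-dimensional toy the KDE-based sequential posterior lands close to the joint posterior; as the abstract notes, which approximation works best depends on the dimensionality and the case at hand.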
first_indexed 2024-12-17T22:54:34Z
format Article
id doaj.art-1aa612b8d5d94374a6b4d304812c6986
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-17T22:54:34Z
publishDate 2020-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling PLoS ONE 15(3): e0230101 (2020-01-01), Public Library of Science (PLoS), ISSN 1932-6203, https://doi.org/10.1371/journal.pone.0230101
title Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference.
url https://doi.org/10.1371/journal.pone.0230101