Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC)...

Bibliographic Details
Main Authors:	Sebastian Höhna, Michael J. Landis, John P. Huelsenbeck
Format:	Article
Language:	English
Published:	PeerJ Inc. 2021-11-01
Series:	PeerJ
Subjects:	Bayes factor; Parallelization; Phylogenetics
Online Access:	https://peerj.com/articles/12438.pdf

author	Sebastian Höhna, Michael J. Landis, John P. Huelsenbeck
collection	DOAJ
description	In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com.
first_indexed	2024-03-09T07:08:52Z
format	Article
id	doaj.art-f0d5f15a6f444068a4bc767018c0774d
institution	Directory Open Access Journal
issn	2167-8359
language	English
last_indexed	2024-03-09T07:08:52Z
publishDate	2021-11-01
publisher	PeerJ Inc.
record_format	Article
series	PeerJ
spelling	doaj.art-f0d5f15a6f444068a4bc767018c0774d; 2023-12-03T09:18:23Z; eng
	Publisher: PeerJ Inc.; journal: PeerJ; ISSN: 2167-8359; published: 2021-11-01; volume 9, article e12438; DOI: 10.7717/peerj.12438
	Title: Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
	Authors: Sebastian Höhna (0), Michael J. Landis (1), John P. Huelsenbeck (2)
	Affiliations: (0) GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany; (1) Department of Biology, Washington University in St. Louis, St. Louis, United States of America; (2) Department of Integrative Biology, University of California, Berkeley, United States of America
	Online access: https://peerj.com/articles/12438.pdf
	Subjects: Bayes factor; Parallelization; Phylogenetics
title	Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
topic	Bayes factor; Parallelization; Phylogenetics
url	https://peerj.com/articles/12438.pdf

Similar Items