Improving the performance of Bayesian phylogenetic inference under relaxed clock models


Bibliographic Details
Main Authors: Rong Zhang, Alexei Drummond
Format: Article
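Language: English
Published: BMC 2020-05-01
Series: BMC Evolutionary Biology
Subjects: Bayesian MCMC; Bayesian phylogenetics; Proposal kernel; Genetic distances; Divergence times; Evolutionary rates
Online Access: http://link.springer.com/article/10.1186/s12862-020-01609-4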
author Rong Zhang
Alexei Drummond
collection DOAJ
description Abstract. Background: Bayesian MCMC has become a common approach for phylogenetic inference, but the growing size of molecular sequence data sets has created a pressing need to improve the computational efficiency of Bayesian phylogenetic inference algorithms. Results: This paper develops a new algorithm to improve the efficiency of Bayesian phylogenetic inference for models that include a per-branch rate parameter. In a Markov chain Monte Carlo algorithm, the presented proposal kernel changes evolutionary rates and divergence times simultaneously, under the constraint that the implied genetic distances remain constant. Specifically, the proposal operates on the divergence time of an internal node and the three adjacent branch rates. For the root of a phylogenetic tree, three strategies are discussed, named Simple Distance, Small Pulley and Big Pulley. Big Pulley can change the tree topology, which enables the operator to sample all the possible rooted trees consistent with the implied unrooted tree. To validate its effectiveness, a series of experiments was performed by implementing the proposed operator in the BEAST2 software. Conclusions: The results demonstrate that the proposed operator improves performance by giving better estimates for a given chain length and by using less running time for a given level of accuracy. Measured by effective samples per hour, the proposed operator yields more efficient overall mixing than the current operators in BEAST2. For large data sets in particular, the improvement is up to half an order of magnitude.
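The constraint described in the abstract (change a node's divergence time and its three adjacent branch rates while holding the implied genetic distances fixed) can be illustrated with a minimal Python sketch. This is not the authors' BEAST2 operator: the function name `constant_distance_move`, the argument names, the convention that times are ages measured backward from the present, and the uniform proposal for the new node time are all assumptions made for illustration. The acceptance step of the MCMC (including any Hastings ratio) is deliberately left out.

```python
import random


def constant_distance_move(t, t_parent, t_left, t_right,
                           r, r_left, r_right, rng=random):
    """Propose a new age for an internal node and rescale three branch rates.

    Hypothetical illustration only. Ages are measured backward in time, so
    t_parent > t > max(t_left, t_right). `r` is the rate on the branch above
    the node; `r_left` and `r_right` are the rates on the two child branches.
    """
    lower = max(t_left, t_right)
    if not (lower < t < t_parent):
        raise ValueError("node age must lie strictly between children and parent")

    # Genetic distance on each adjacent branch = rate * branch duration.
    d_parent = r * (t_parent - t)
    d_left = r_left * (t - t_left)
    d_right = r_right * (t - t_right)

    # Assumed proposal: draw the new age uniformly inside the valid interval,
    # redrawing in the rare case an endpoint is returned.
    t_new = rng.uniform(lower, t_parent)
    while not (lower < t_new < t_parent):
        t_new = rng.uniform(lower, t_parent)

    # Rescale each rate so that rate * new duration equals the old distance.
    r_new = d_parent / (t_parent - t_new)
    r_left_new = d_left / (t_new - t_left)
    r_right_new = d_right / (t_new - t_right)
    return t_new, r_new, r_left_new, r_right_new
```

Because each new rate is the old distance divided by the new branch duration, the three products of rate and duration are unchanged by the move; only the split between time and rate shifts.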
first_indexed 2024-12-22T05:54:09Z
id doaj.art-c249dc42ac694147a8eb869a6640322b
institution Directory Open Access Journal
issn 1471-2148
last_indexed 2024-12-22T05:54:09Z
spelling doaj.art-c249dc42ac694147a8eb869a6640322b 2022-12-21T18:36:48Z eng
BMC, BMC Evolutionary Biology, ISSN 1471-2148, 2020-05-01, 201128
DOI 10.1186/s12862-020-01609-4
Improving the performance of Bayesian phylogenetic inference under relaxed clock models
Rong Zhang (School of Computer Science, University of Auckland)
Alexei Drummond (School of Computer Science, University of Auckland)
topic Bayesian MCMC
Bayesian phylogenetics
Proposal kernel
Genetic distances
Divergence times
Evolutionary rates