Long-branch attraction bias and inconsistency in Bayesian phylogenetics.

Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model...

Bibliographic Details
Main Authors:	Bryan Kolaczkowski, Joseph W Thornton
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2009-12-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC2785476?pdf=render

author	Bryan Kolaczkowski, Joseph W Thornton
collection	DOAJ
description	Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
first_indexed	2024-12-20T02:27:10Z
format	Article
id	doaj.art-9e02c5a751224b5da76cc63d92373a8e
institution	Directory Open Access Journal
issn	1932-6203
language	English
last_indexed	2024-12-20T02:27:10Z
publishDate	2009-12-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj.art-9e02c5a751224b5da76cc63d92373a8e | 2022-12-21T19:56:40Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2009-12-01 | vol. 4, no. 12, e7891 | doi:10.1371/journal.pone.0007891 | Long-branch attraction bias and inconsistency in Bayesian phylogenetics. | Bryan Kolaczkowski, Joseph W Thornton | http://europepmc.org/articles/PMC2785476?pdf=render
title	Long-branch attraction bias and inconsistency in Bayesian phylogenetics.
url	http://europepmc.org/articles/PMC2785476?pdf=render

Similar Items