Practical guidelines for Bayesian phylogenetic inference using Markov Chain Monte Carlo (MCMC) [version 1; peer review: 2 approved, 1 approved with reservations]

Phylogenetic estimation is, and has always been, a complex endeavor. Estimating a phylogenetic tree involves evaluating many possible solutions and possible evolutionary histories that could explain a set of observed data, typically by using a model of evolution. Modern statistical methods involve n...

Bibliographic Details
Main Authors: Orlando Schwery, Joëlle Barido-Sottani, Chi Zhang, Rachel C. M. Warnock, April Marie Wright
Format: Article
Language: English
Published: F1000 Research Ltd 2023-11-01
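Series: Open Research Europe
Subjects: Bayesian phylogenetic inference; MCMC; troubleshooting; phylogenetic inference software; fossilized birth-death; total-evidence
Online Access: https://open-research-europe.ec.europa.eu/articles/3-204/v1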
author Orlando Schwery
Joëlle Barido-Sottani
Chi Zhang
Rachel C. M. Warnock
April Marie Wright
collection DOAJ
description Phylogenetic estimation is, and has always been, a complex endeavor. Estimating a phylogenetic tree involves evaluating many possible solutions and possible evolutionary histories that could explain a set of observed data, typically by using a model of evolution. Modern statistical methods involve not just the estimation of a tree, but also solutions to more complex models involving fossil record information and other data sources. Markov Chain Monte Carlo (MCMC) is a leading method for approximating the posterior distribution of parameters in a mathematical model. It is deployed in all Bayesian phylogenetic tree estimation software. While many researchers use MCMC in phylogenetic analyses, interpreting results and diagnosing problems with MCMC remain vexing issues to many biologists. In this manuscript, we will offer an overview of how MCMC is used in Bayesian phylogenetic inference, with a particular emphasis on complex hierarchical models, such as the fossilized birth-death (FBD) model. We will discuss strategies to diagnose common MCMC problems and troubleshoot difficult analyses, in particular convergence issues. We will show how the study design, the choice of models and priors, but also technical features of the inference tools themselves can all be adjusted to obtain the best results. Finally, we will also discuss the unique challenges created by the incorporation of fossil information in phylogenetic inference, and present tips to address them.
first_indexed 2024-04-25T00:17:36Z
format Article
id doaj.art-4f50f579f47547be9697d08ca2ce3ef4
institution Directory Open Access Journal
issn 2732-5121
language English
last_indexed 2024-04-25T00:17:36Z
publishDate 2023-11-01
publisher F1000 Research Ltd
record_format Article
series Open Research Europe
spelling doaj.art-4f50f579f47547be9697d08ca2ce3ef4
2024-03-13T01:00:00Z
eng
F1000 Research Ltd
Open Research Europe
2732-5121
2023-11-01
318012
Orlando Schwery: Department of Biological Sciences, Southeastern Louisiana University, Hammond, Louisiana, 70402, USA
Joëlle Barido-Sottani (https://orcid.org/0000-0002-5220-5468): Institut de Biologie de l’ENS (IBENS), École normale supérieure, CNRS, INSERM, Université PSL, Paris, Île-de-France, 75005, France
Chi Zhang: Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, 100044, China
Rachel C. M. Warnock: GeoZentrum Nordbayern, Department of Geography and Geosciences, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Bavaria, 91054, Germany
April Marie Wright: Department of Biological Sciences, Southeastern Louisiana University, Hammond, Louisiana, 70402, USA
https://open-research-europe.ec.europa.eu/articles/3-204/v1
title Practical guidelines for Bayesian phylogenetic inference using Markov Chain Monte Carlo (MCMC) [version 1; peer review: 2 approved, 1 approved with reservations]
topic Bayesian phylogenetic inference
MCMC
troubleshooting
phylogenetic inference software
fossilized birth-death
total-evidence
url https://open-research-europe.ec.europa.eu/articles/3-204/v1