A maximum pseudo-likelihood approach for estimating species trees under the coalescent model


Bibliographic Details
Main Authors: Edwards Scott V, Yu Lili, Liu Liang
Format: Article
Language: English
Published: BMC 2010-10-01
Series: BMC Evolutionary Biology
Online Access: http://www.biomedcentral.com/1471-2148/10/302
collection DOAJ
description Abstract

Background: Several phylogenetic approaches have been developed to estimate species trees from collections of gene trees. However, maximum likelihood approaches for estimating species trees under the coalescent model are limited. Although the likelihood of a species tree under the multispecies coalescent model has already been derived by Rannala and Yang, it can be shown that the maximum likelihood estimate (MLE) of the species tree (topology, branch lengths, and population sizes) from gene trees under this formula does not exist. In this paper, we develop a pseudo-likelihood function of the species tree to obtain maximum pseudo-likelihood estimates (MPE) of species trees, with branch lengths of the species tree in coalescent units.

Results: We show that the MPE of the species tree is statistically consistent as the number M of genes goes to infinity. In addition, the probability that the MPE of the species tree matches the true species tree converges to 1 at rate O(M⁻¹). The simulation results confirm that the maximum pseudo-likelihood approach is statistically consistent even when the species tree is in the anomaly zone. We applied our method, Maximum Pseudo-likelihood for Estimating Species Trees (MP-EST), to a mammal dataset. The four major clades found in the MP-EST tree are consistent with those in the Bayesian concatenation tree. The bootstrap supports for the species tree estimated by the MP-EST method are more reasonable than the posterior probability supports given by the Bayesian concatenation method in reflecting the level of uncertainty in gene trees and controversies over the relationship of four major groups of placental mammals.

Conclusions: MP-EST can consistently estimate the topology and branch lengths (in coalescent units) of the species tree. Although the pseudo-likelihood is derived from coalescent theory, and assumes no gene flow or horizontal gene transfer (HGT), the MP-EST method is robust to a small amount of HGT in the dataset. In addition, increasing the number of genes does not increase the computational time substantially. The MP-EST method is fast for analyzing datasets that involve a large number of genes but a moderate number of species.
id doaj.art-a13cb8df0a8243f4b424aab848a0331c
institution Directory Open Access Journal
issn 1471-2148
doi 10.1186/1471-2148-10-302