A maximum pseudo-likelihood approach for estimating species trees under the coalescent model
Abstract. Background: Several phylogenetic approaches have been developed to estimate species trees from collections of gene trees. However, maximum likelihood approaches for estimating species trees under the coalescent model are limited. Although the li...
Main Authors: | Edwards Scott V, Yu Lili, Liu Liang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2010-10-01 |
Series: | BMC Evolutionary Biology |
Online Access: | http://www.biomedcentral.com/1471-2148/10/302 |
_version_ | 1818390609170268160 |
---|---|
author | Edwards Scott V Yu Lili Liu Liang |
author_facet | Edwards Scott V Yu Lili Liu Liang |
author_sort | Edwards Scott V |
collection | DOAJ |
description | Abstract. Background: Several phylogenetic approaches have been developed to estimate species trees from collections of gene trees. However, maximum likelihood approaches for estimating species trees under the coalescent model are limited. Although the likelihood of a species tree under the multispecies coalescent model has already been derived by Rannala and Yang, it can be shown that the maximum likelihood estimate (MLE) of the species tree (topology, branch lengths, and population sizes) from gene trees under this formula does not exist. In this paper, we develop a pseudo-likelihood function of the species tree to obtain maximum pseudo-likelihood estimates (MPE) of species trees, with branch lengths of the species tree in coalescent units. Results: We show that the MPE of the species tree is statistically consistent as the number $M$ of genes goes to infinity. In addition, the probability that the MPE of the species tree matches the true species tree converges to 1 at rate $O(M^{-1})$. The simulation results confirm that the maximum pseudo-likelihood approach is statistically consistent even when the species tree is in the anomaly zone. We applied our method, Maximum Pseudo-likelihood for Estimating Species Trees (MP-EST), to a mammal dataset. The four major clades found in the MP-EST tree are consistent with those in the Bayesian concatenation tree. The bootstrap supports for the species tree estimated by the MP-EST method are more reasonable than the posterior probability supports given by the Bayesian concatenation method in reflecting the level of uncertainty in gene trees and controversies over the relationship of four major groups of placental mammals. Conclusions: MP-EST can consistently estimate the topology and branch lengths (in coalescent units) of the species tree. Although the pseudo-likelihood is derived from coalescent theory, and assumes no gene flow or horizontal gene transfer (HGT), the MP-EST method is robust to a small amount of HGT in the dataset. In addition, increasing the number of genes does not increase the computational time substantially. The MP-EST method is fast for analyzing datasets that involve a large number of genes but a moderate number of species. |
first_indexed | 2024-12-14T05:00:21Z |
format | Article |
id | doaj.art-a13cb8df0a8243f4b424aab848a0331c |
institution | Directory Open Access Journal |
issn | 1471-2148 |
language | English |
last_indexed | 2024-12-14T05:00:21Z |
publishDate | 2010-10-01 |
publisher | BMC |
record_format | Article |
series | BMC Evolutionary Biology |
spelling | doaj.art-a13cb8df0a8243f4b424aab848a0331c 2022-12-21T23:16:15Z eng BMC BMC Evolutionary Biology 1471-2148 2010-10-01 10 1 302 10.1186/1471-2148-10-302 A maximum pseudo-likelihood approach for estimating species trees under the coalescent model Edwards Scott V Yu Lili Liu Liang http://www.biomedcentral.com/1471-2148/10/302 |
spellingShingle | Edwards Scott V Yu Lili Liu Liang A maximum pseudo-likelihood approach for estimating species trees under the coalescent model BMC Evolutionary Biology |
title | A maximum pseudo-likelihood approach for estimating species trees under the coalescent model |
title_full | A maximum pseudo-likelihood approach for estimating species trees under the coalescent model |
title_fullStr | A maximum pseudo-likelihood approach for estimating species trees under the coalescent model |
title_full_unstemmed | A maximum pseudo-likelihood approach for estimating species trees under the coalescent model |
title_short | A maximum pseudo-likelihood approach for estimating species trees under the coalescent model |
title_sort | maximum pseudo likelihood approach for estimating species trees under the coalescent model |
url | http://www.biomedcentral.com/1471-2148/10/302 |
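
The abstract in this record says that the pseudo-likelihood is derived from coalescent theory and yields branch lengths in coalescent units, but it does not spell out the construction. As a rough illustration only, the sketch below assumes a rooted-triple formulation: under the standard coalescent result for three species with internal branch length t, the gene tree triple matching the species tree has probability 1 - (2/3)e^(-t), and each of the two discordant triples has probability (1/3)e^(-t). The function names are hypothetical, and the code handles a single triple, not a search over species tree topologies.

```python
import math

def triple_probs(t):
    """Probabilities of the three rooted gene-tree triples for a species
    triple whose internal branch has length t (coalescent units).
    Order: (matching triple, discordant triple 1, discordant triple 2)."""
    discordant = math.exp(-t) / 3.0
    return (1.0 - 2.0 * discordant, discordant, discordant)

def triple_log_pseudo_likelihood(counts, t):
    """Multinomial log-likelihood (up to a constant) of observed triple
    counts = (n_match, n_alt1, n_alt2) across gene trees.
    Assumes t is finite, so every probability is strictly positive."""
    return sum(n * math.log(p) for n, p in zip(counts, triple_probs(t)))

def estimate_branch_length(counts):
    """Closed-form maximizer of the single-triple log-likelihood.
    Setting (2/3) * exp(-t) equal to the observed discordant fraction
    gives t = -ln(1.5 * fraction), truncated at the boundary t = 0."""
    n_match, n_alt1, n_alt2 = counts
    total = n_match + n_alt1 + n_alt2
    frac_discordant = (n_alt1 + n_alt2) / total
    if frac_discordant >= 2.0 / 3.0:
        return 0.0          # no support for a positive internal branch
    if frac_discordant == 0.0:
        return math.inf     # no discordance observed: estimate is unbounded
    return -math.log(1.5 * frac_discordant)

# Hypothetical counts: 80 of 100 gene trees match the species triple.
t_hat = estimate_branch_length((80, 10, 10))
```

A full analysis would sum such terms over every species triple induced by a candidate species tree and maximize jointly over topology and branch lengths; that search is not shown here.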