Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation

Abstract One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current mos...

Full description

Bibliographic Details
Main Authors: Yu, Xilin, Le, Thien, Christensen, Sarah A., Molloy, Erin K., Warnow, Tandy
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language:English
Published: BioMed Central 2021
Online Access:https://hdl.handle.net/1721.1/136893
_version_ 1811095329976614912
author Yu, Xilin
Le, Thien
Christensen, Sarah A.
Molloy, Erin K.
Warnow, Tandy
author2 Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Yu, Xilin
Le, Thien
Christensen, Sarah A.
Molloy, Erin K.
Warnow, Tandy
author_sort Yu, Xilin
collection MIT
description Abstract One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a “supertree method”. Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. Exact-RFS-2 is available in open source form on Github at https://github.com/yuxilin51/GreedyRFS .
first_indexed 2024-09-23T16:15:22Z
format Article
id mit-1721.1/136893
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T16:15:22Z
publishDate 2021
publisher BioMed Central
record_format dspace
spelling mit-1721.1/1368932023-01-11T19:12:38Z Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation Yu, Xilin Le, Thien Christensen, Sarah A. Molloy, Erin K. Warnow, Tandy Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Abstract One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a “supertree method”. Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. Exact-RFS-2 is available in open source form on Github at https://github.com/yuxilin51/GreedyRFS . 2021-11-01T14:34:02Z 2021-11-01T14:34:02Z 2021-06-28 2021-07-04T03:21:16Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/136893 Algorithms for Molecular Biology. 2021 Jun 28;16(1):12 PUBLISHER_CC en https://doi.org/10.1186/s13015-021-00189-2 Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ The Author(s) application/pdf BioMed Central BioMed Central
spellingShingle Yu, Xilin
Le, Thien
Christensen, Sarah A.
Molloy, Erin K.
Warnow, Tandy
Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
title Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
title_full Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
title_fullStr Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
title_full_unstemmed Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
title_short Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
title_sort using robinson foulds supertrees in divide and conquer phylogeny estimation
url https://hdl.handle.net/1721.1/136893
work_keys_str_mv AT yuxilin usingrobinsonfouldssupertreesindivideandconquerphylogenyestimation
AT lethien usingrobinsonfouldssupertreesindivideandconquerphylogenyestimation
AT christensensaraha usingrobinsonfouldssupertreesindivideandconquerphylogenyestimation
AT molloyerink usingrobinsonfouldssupertreesindivideandconquerphylogenyestimation
AT warnowtandy usingrobinsonfouldssupertreesindivideandconquerphylogenyestimation