Phylo2Vec: a vector representation for binary trees
Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree...
Main Authors: | Penn, MJ; Scheidwasser, N; Khurana, MP; Duchêne, DA; Donnelly, CA; Bhatt, S |
---|---|
Format: | Journal article |
Language: | English |
Published: | Oxford University Press, 2024 |
_version_ | 1826313884161540096 |
---|---|
author | Penn, MJ Scheidwasser, N Khurana, MP Duchêne, DA Donnelly, CA Bhatt, S |
author_facet | Penn, MJ Scheidwasser, N Khurana, MP Duchêne, DA Donnelly, CA Bhatt, S |
author_sort | Penn, MJ |
collection | OXFORD |
description | <p>Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree search, using different data structures for easy manipulation (e.g., classes in object-oriented programming languages) and readable representation of trees (e.g., Newick-format strings). Here, we present Phylo2Vec, a parsimonious encoding for phylogenetic trees that serves as a unified approach for both manipulating and representing phylogenetic trees. Phylo2Vec maps any binary tree with <em>n</em> leaves to a unique integer vector of length <em>n</em> − 1. The advantages of Phylo2Vec are fourfold: (i) fast tree sampling, (ii) compressed tree representation compared to a Newick string, (iii) quick and unambiguous verification of whether two binary trees are topologically identical, and (iv) a systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for maximum likelihood inference on five real-world datasets and show that a simple hill-climbing-based optimisation scheme can efficiently traverse the vastness of tree space from a random to an optimal tree.</p> |
first_indexed | 2024-09-25T04:23:27Z |
format | Journal article |
id | oxford-uuid:4f2cb9e6-9607-4fa1-9249-a18d5f655b85 |
institution | University of Oxford |
language | English |
last_indexed | 2024-09-25T04:23:27Z |
publishDate | 2024 |
publisher | Oxford University Press |
record_format | dspace |
spelling | oxford-uuid:4f2cb9e6-9607-4fa1-9249-a18d5f655b85; 2024-08-21T08:36:21Z; Phylo2Vec: a vector representation for binary trees; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:4f2cb9e6-9607-4fa1-9249-a18d5f655b85; English; Symplectic Elements; Oxford University Press; 2024; Penn, MJ; Scheidwasser, N; Khurana, MP; Duchêne, DA; Donnelly, CA; Bhatt, S |
spellingShingle | Penn, MJ Scheidwasser, N Khurana, MP Duchêne, DA Donnelly, CA Bhatt, S Phylo2Vec: a vector representation for binary trees |
title | Phylo2Vec: a vector representation for binary trees |
title_full | Phylo2Vec: a vector representation for binary trees |
title_fullStr | Phylo2Vec: a vector representation for binary trees |
title_full_unstemmed | Phylo2Vec: a vector representation for binary trees |
title_short | Phylo2Vec: a vector representation for binary trees |
title_sort | phylo2vec a vector representation for binary trees |
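
The description above states that a binary tree with n leaves maps to a unique integer vector of length n − 1, which makes sampling and topological comparison cheap. The following is a minimal sketch of those two operations, not the authors' implementation. It assumes (this is not stated in the record) that entry i of the vector, counting from 0, is constrained to the range 0 to 2i; the function names are hypothetical and chosen for illustration only.

```python
import random
from math import prod


def sample_vector(n_leaves, rng=None):
    """Draw a random integer vector of length n_leaves - 1.

    Assumption (not given in the record): entry i lies in {0, ..., 2i}.
    """
    rng = rng or random.Random()
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


def is_valid(v):
    """Check the assumed range constraint on every entry."""
    return all(isinstance(x, int) and 0 <= x <= 2 * i for i, x in enumerate(v))


def count_vectors(n_leaves):
    """Number of distinct vectors under the assumed constraint: 1 * 3 * 5 * ..."""
    return prod(2 * i + 1 for i in range(n_leaves - 1))


def same_topology(v1, v2):
    """If each tree maps to a unique vector, equality reduces to comparing vectors."""
    return list(v1) == list(v2)


if __name__ == "__main__":
    v = sample_vector(6, random.Random(0))
    print(v, is_valid(v), count_vectors(6))
```

Under the assumed constraint, the number of valid vectors for n leaves is the product of the odd numbers up to 2n − 3, which matches the known count of rooted binary trees with n labelled leaves; this is consistent with the one-to-one mapping the description claims. Decoding a vector back into a tree is not shown here because the record does not describe that procedure.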