Genome trees from conservation profiles.
The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data...
Main Authors: | Fredj Tekaia, Edouard Yeramian |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2005-12-01 |
Series: | PLoS Computational Biology |
Online Access: | http://europepmc.org/articles/PMC1314884?pdf=render |
_version_ | 1811268311185358848 |
---|---|
author | Fredj Tekaia Edouard Yeramian |
collection | DOAJ |
description | The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data it becomes of increasing importance to develop accurate methods for grasping global trends for the phylogenetic signals that mutually link the various genomes. We therefore derive here the methodological concept of genome trees based on protein conservation profiles in multiple species. The basic idea in this derivation is that the multi-component "presence-absence" protein conservation profiles permit tracking of common evolutionary histories of genes across multiple genomes. We show that a significant reduction in informational redundancy is achieved by considering only the subset of distinct conservation profiles. Beyond these basic ideas, we point out various pitfalls and limitations associated with the data handling, paving the way for further improvements. As an illustration for the methods, we analyze a genome tree based on the above principles, along with a series of other trees derived from the same data and based on pair-wise comparisons (ancestral duplication-conservation and shared orthologs). In all trees we observe a sharp discrimination between the three primary domains of life: Bacteria, Archaea, and Eukarya. The new genome tree, based on conservation profiles, displays a significant correspondence with classically recognized taxonomical groupings, along with a series of departures from such conventional clusterings. |
first_indexed | 2024-04-12T21:19:24Z |
format | Article |
id | doaj.art-a7fd926628814860a566ed39be490716 |
institution | Directory Open Access Journal |
issn | 1553-734X 1553-7358 |
language | English |
last_indexed | 2024-04-12T21:19:24Z |
publishDate | 2005-12-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
spelling | doaj.art-a7fd926628814860a566ed39be490716 2022-12-22T03:16:20Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2005-12-01 1 7 e75 10.1371/journal.pcbi.0010075 Genome trees from conservation profiles. Fredj Tekaia Edouard Yeramian |
title | Genome trees from conservation profiles. |
url | http://europepmc.org/articles/PMC1314884?pdf=render |