Genome trees from conservation profiles.

The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data...

Bibliographic Details
Main Authors: Fredj Tekaia, Edouard Yeramian
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2005-12-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC1314884?pdf=render
collection DOAJ
description The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data it becomes of increasing importance to develop accurate methods for grasping global trends for the phylogenetic signals that mutually link the various genomes. We therefore derive here the methodological concept of genome trees based on protein conservation profiles in multiple species. The basic idea in this derivation is that the multi-component "presence-absence" protein conservation profiles permit tracking of common evolutionary histories of genes across multiple genomes. We show that a significant reduction in informational redundancy is achieved by considering only the subset of distinct conservation profiles. Beyond these basic ideas, we point out various pitfalls and limitations associated with the data handling, paving the way for further improvements. As an illustration for the methods, we analyze a genome tree based on the above principles, along with a series of other trees derived from the same data and based on pair-wise comparisons (ancestral duplication-conservation and shared orthologs). In all trees we observe a sharp discrimination between the three primary domains of life: Bacteria, Archaea, and Eukarya. The new genome tree, based on conservation profiles, displays a significant correspondence with classically recognized taxonomical groupings, along with a series of departures from such conventional clusterings.
id doaj.art-a7fd926628814860a566ed39be490716
institution Directory of Open Access Journals
issn 1553-734X
1553-7358
spelling Fredj Tekaia, Edouard Yeramian. Genome trees from conservation profiles. PLoS Computational Biology 1(7): e75 (2005-12-01). Public Library of Science (PLoS). ISSN 1553-734X, 1553-7358. doi:10.1371/journal.pcbi.0010075. http://europepmc.org/articles/PMC1314884?pdf=render
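
The "presence-absence" conservation profiles mentioned in the description can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy genomes `A`, `B`, `C`, the protein identifiers, the helper names, and the use of a Jaccard distance between genomes. The record does not state which distance or tree-building method the authors use, so this is not their procedure, only a small example of building profiles and keeping the distinct ones.

```python
# Illustrative sketch only: toy data and helper names are hypothetical.

def build_profiles(hits, genomes):
    """Map each protein to a 0/1 tuple: 1 if a homolog is found in that genome."""
    return {
        protein: tuple(int(g in found) for g in genomes)
        for protein, found in hits.items()
    }

def distinct_profiles(profiles):
    """Keep one copy of each profile, dropping redundant repeats."""
    return sorted(set(profiles.values()))

def jaccard_distance(rows, i, j):
    """Jaccard distance between genome columns i and j (an assumed choice)."""
    both = sum(1 for r in rows if r[i] and r[j])
    either = sum(1 for r in rows if r[i] or r[j])
    return 1.0 - both / either if either else 0.0

genomes = ["A", "B", "C"]
# protein id -> genomes in which a homolog was detected (made-up values)
hits = {
    "p1": {"A", "B"},
    "p2": {"A", "B"},
    "p3": {"A", "B", "C"},
    "p4": {"C"},
    "p5": {"A", "B"},
}

profiles = build_profiles(hits, genomes)
rows = distinct_profiles(profiles)

print(len(profiles), "profiles,", len(rows), "distinct")
for i in range(len(genomes)):
    for j in range(i + 1, len(genomes)):
        print(genomes[i], genomes[j], round(jaccard_distance(rows, i, j), 3))
```

In this toy input, three of the five proteins share the same profile, so collapsing to distinct profiles shrinks the table before any genome-to-genome comparison is made. The resulting pairwise distances could then be passed to any standard clustering or tree-building routine.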