Origin and evolution of protein fold designs inferred from phylogenomic analysis of CATH domain structures in proteomes.

The spatial arrangements of secondary structures in proteins, irrespective of their connectivity, depict the overall shape and organization of protein domains. These features have been used in the CATH and SCOP classifications to hierarchically partition fold space and define the architectural make...

Bibliographic Details
Main Authors: Syed Abbas Bukhari, Gustavo Caetano-Anollés
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS Computational Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23555236/pdf/?tool=EBI
author Syed Abbas Bukhari
Gustavo Caetano-Anollés
collection DOAJ
description The spatial arrangements of secondary structures in proteins, irrespective of their connectivity, depict the overall shape and organization of protein domains. These features have been used in the CATH and SCOP classifications to hierarchically partition fold space and define the architectural make up of proteins. Here we use phylogenomic methods and a census of CATH structures in hundreds of genomes to study the origin and diversification of protein architectures (A) and their associated topologies (T) and superfamilies (H). Phylogenies that describe the evolution of domain structures and proteomes were reconstructed from the structural census and used to generate timelines of domain discovery. Phylogenies of CATH domains at T and H levels of structural abstraction and associated chronologies revealed patterns of reductive evolution, the early rise of Archaea, three epochs in the evolution of the protein world, and patterns of structural sharing between superkingdoms. Phylogenies of proteomes confirmed the early appearance of Archaea. While these findings are in agreement with previous phylogenomic studies based on the SCOP classification, phylogenies unveiled sharing patterns between Archaea and Eukarya that are recent and can explain the canonical bacterial rooting typically recovered from sequence analysis. Phylogenies of CATH domains at A level uncovered general patterns of architectural origin and diversification. The tree of A structures showed that ancient structural designs such as the 3-layer (αβα) sandwich (3.40) or the orthogonal bundle (1.10) are comparatively simpler in their makeup and are involved in basic cellular functions. In contrast, modern structural designs such as prisms, propellers, 2-solenoid, super-roll, clam, trefoil and box are not widely distributed and were probably adopted to perform specialized functions. Our timelines therefore uncover a universal tendency towards protein structural complexity that is remarkable.
first_indexed 2024-12-15T00:25:55Z
format Article
id doaj.art-9ce1720015e3481b82b564b942dc3958
institution Directory Open Access Journal
issn 1553-734X
1553-7358
language English
last_indexed 2024-12-15T00:25:55Z
publishDate 2013-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj.art-9ce1720015e3481b82b564b942dc3958
2022-12-21T22:42:10Z
eng
Public Library of Science (PLoS)
PLoS Computational Biology
1553-734X
1553-7358
2013-01-01
9(3): e1003009
10.1371/journal.pcbi.1003009
Origin and evolution of protein fold designs inferred from phylogenomic analysis of CATH domain structures in proteomes.
Syed Abbas Bukhari
Gustavo Caetano-Anollés
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23555236/pdf/?tool=EBI
title Origin and evolution of protein fold designs inferred from phylogenomic analysis of CATH domain structures in proteomes.
url https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23555236/pdf/?tool=EBI