Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics [version 1; referees: 1 approved, 2 approved with reservations]

A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need for defining arbitrary acceptance thresholds for simi...


Bibliographic Details
Main Authors: Jasper J. Koehorst, Edoardo Saccenti, Peter J. Schaap, Vitor A. P. Martins dos Santos, Maria Suarez-Diez
Format: Article
Language: English
Published: F1000 Research Ltd 2016-08-01
Series: F1000Research
Subjects: Genomics; Microbial Evolution & Genomics
Online Access: http://f1000research.com/articles/5-1987/v1
_version_ 1828894405097947136
author Jasper J. Koehorst
Edoardo Saccenti
Peter J. Schaap
Vitor A. P. Martins dos Santos
Maria Suarez-Diez
author_facet Jasper J. Koehorst
Edoardo Saccenti
Peter J. Schaap
Vitor A. P. Martins dos Santos
Maria Suarez-Diez
author_sort Jasper J. Koehorst
collection DOAJ
description A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need to define arbitrary acceptance thresholds for similarity and alignment length, lateral gene acquisition, and the high computational cost of finding bi-directional best matches at a large scale. We investigated the use of protein domain architectures for large-scale functional comparative analysis as an alternative method. The performance of both approaches was assessed through functional comparison of 446 bacterial genomes sampled at different taxonomic levels. We show that protein domain architectures provide a fast and efficient alternative to methods based on sequence similarity to identify groups of functionally equivalent proteins within and across taxonomic boundaries. As the computational cost scales linearly, and not quadratically, with the number of genomes, the approach is suitable for large-scale comparative analysis. Running both methods in parallel pinpoints potential functional adaptations that may add to bacterial fitness.
first_indexed 2024-12-13T14:09:59Z
format Article
id doaj.art-22dc75771d6846f5a1f3de6b3c6f88a9
institution Directory Open Access Journal
issn 2046-1402
language English
last_indexed 2024-12-13T14:09:59Z
publishDate 2016-08-01
publisher F1000 Research Ltd
record_format Article
series F1000Research
spelling doaj.art-22dc75771d6846f5a1f3de6b3c6f88a9 2022-12-21T23:42:29Z eng F1000 Research Ltd F1000Research 2046-1402 2016-08-01 5 10.12688/f1000research.9416.1 10140
Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics [version 1; referees: 1 approved, 2 approved with reservations]
Jasper J. Koehorst 0
Edoardo Saccenti 1
Peter J. Schaap 2
Vitor A. P. Martins dos Santos 3
Maria Suarez-Diez 4
Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng, Netherlands
Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng, Netherlands
Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng, Netherlands
LifeGlimmer GmbH, Berlin, Germany
Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng, Netherlands
A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need to define arbitrary acceptance thresholds for similarity and alignment length, lateral gene acquisition, and the high computational cost of finding bi-directional best matches at a large scale. We investigated the use of protein domain architectures for large-scale functional comparative analysis as an alternative method. The performance of both approaches was assessed through functional comparison of 446 bacterial genomes sampled at different taxonomic levels. We show that protein domain architectures provide a fast and efficient alternative to methods based on sequence similarity to identify groups of functionally equivalent proteins within and across taxonomic boundaries. As the computational cost scales linearly, and not quadratically, with the number of genomes, the approach is suitable for large-scale comparative analysis. Running both methods in parallel pinpoints potential functional adaptations that may add to bacterial fitness.
http://f1000research.com/articles/5-1987/v1
Genomics
Microbial Evolution & Genomics
spellingShingle Jasper J. Koehorst
Edoardo Saccenti
Peter J. Schaap
Vitor A. P. Martins dos Santos
Maria Suarez-Diez
Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics [version 1; referees: 1 approved, 2 approved with reservations]
F1000Research
Genomics
Microbial Evolution & Genomics
title Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics [version 1; referees: 1 approved, 2 approved with reservations]
title_sort protein domain architectures provide a fast efficient and scalable alternative to sequence based methods for comparative functional genomics version 1 referees 1 approved 2 approved with reservations
topic Genomics
Microbial Evolution & Genomics
url http://f1000research.com/articles/5-1987/v1