CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters

Bibliographic Details
Main Authors: Matthias van den Belt, Cameron Gilchrist, Thomas J. Booth, Yit-Heng Chooi, Marnix H. Medema, Mohammad Alanjary
Format: Article
Language: English
Published: BMC 2023-05-01
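Series: BMC Bioinformatics
Subjects: Gene cluster; Secondary metabolite; Homology search; Colocalized; Biosynthetic; Comparative analysis
Online Access: https://doi.org/10.1186/s12859-023-05311-2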
_version_ 1797831911596359680
author Matthias van den Belt
Cameron Gilchrist
Thomas J. Booth
Yit-Heng Chooi
Marnix H. Medema
Mohammad Alanjary
author_facet Matthias van den Belt
Cameron Gilchrist
Thomas J. Booth
Yit-Heng Chooi
Marnix H. Medema
Mohammad Alanjary
author_sort Matthias van den Belt
collection DOAJ
description Abstract. Background: Co-localized sets of genes that encode specialized functions are common across microbial genomes and occur in genomes of larger eukaryotes as well. Important examples include Biosynthetic Gene Clusters (BGCs) that produce specialized metabolites with medicinal, agricultural, and industrial value (e.g. antimicrobials). Comparative analysis of BGCs can aid in the discovery of novel metabolites by highlighting distribution and identifying variants in public genomes. Unfortunately, gene-cluster-level homology detection remains inaccessible, time-consuming and difficult to interpret. Results: The comparative gene cluster analysis toolbox (CAGECAT) is a rapid and user-friendly platform to mitigate difficulties in comparative analysis of whole gene clusters. The software provides homology searches and downstream analyses without the need for command-line or programming expertise. By leveraging remote BLAST databases, which always provide up-to-date results, CAGECAT can yield relevant matches that aid in the comparison, taxonomic distribution, or evolution of an unknown query. The service is extensible and interoperable and implements the cblaster and clinker pipelines to perform homology search, filtering, gene neighbourhood estimation, and dynamic visualisation of resulting variant BGCs. With the visualisation module, publication-quality figures can be customized directly from a web-browser, which greatly accelerates their interpretation via informative overlays to identify conserved genes in a BGC query. Conclusion: Overall, CAGECAT is an extensible software that can be interfaced via a standard web-browser for whole region homology searches and comparison on continually updated genomes from NCBI. The public web server and installable docker image are open source and freely available without registration at: https://cagecat.bioinformatics.nl .
first_indexed 2024-04-09T13:59:20Z
format Article
id doaj.art-ce3ee01c945f4592902421521f44c479
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-09T13:59:20Z
publishDate 2023-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-ce3ee01c945f4592902421521f44c479 | 2023-05-07T11:25:48Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2023-05-01 | 24118 | 10.1186/s12859-023-05311-2 | CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters | Matthias van den Belt [0]; Cameron Gilchrist [1]; Thomas J. Booth [2]; Yit-Heng Chooi [3]; Marnix H. Medema [4]; Mohammad Alanjary [5] | [0] Bioinformatics Group, Wageningen University and Research; [1] School of Molecular Sciences, The University of Western Australia; [2] School of Molecular Sciences, The University of Western Australia; [3] School of Molecular Sciences, The University of Western Australia; [4] Bioinformatics Group, Wageningen University and Research; [5] Bioinformatics Group, Wageningen University and Research | Abstract. Background: Co-localized sets of genes that encode specialized functions are common across microbial genomes and occur in genomes of larger eukaryotes as well. Important examples include Biosynthetic Gene Clusters (BGCs) that produce specialized metabolites with medicinal, agricultural, and industrial value (e.g. antimicrobials). Comparative analysis of BGCs can aid in the discovery of novel metabolites by highlighting distribution and identifying variants in public genomes. Unfortunately, gene-cluster-level homology detection remains inaccessible, time-consuming and difficult to interpret. Results: The comparative gene cluster analysis toolbox (CAGECAT) is a rapid and user-friendly platform to mitigate difficulties in comparative analysis of whole gene clusters. The software provides homology searches and downstream analyses without the need for command-line or programming expertise. By leveraging remote BLAST databases, which always provide up-to-date results, CAGECAT can yield relevant matches that aid in the comparison, taxonomic distribution, or evolution of an unknown query. The service is extensible and interoperable and implements the cblaster and clinker pipelines to perform homology search, filtering, gene neighbourhood estimation, and dynamic visualisation of resulting variant BGCs. With the visualisation module, publication-quality figures can be customized directly from a web-browser, which greatly accelerates their interpretation via informative overlays to identify conserved genes in a BGC query. Conclusion: Overall, CAGECAT is an extensible software that can be interfaced via a standard web-browser for whole region homology searches and comparison on continually updated genomes from NCBI. The public web server and installable docker image are open source and freely available without registration at: https://cagecat.bioinformatics.nl . | https://doi.org/10.1186/s12859-023-05311-2 | Gene cluster; Secondary metabolite; Homology search; Colocalized; Biosynthetic; Comparative analysis
spellingShingle Matthias van den Belt
Cameron Gilchrist
Thomas J. Booth
Yit-Heng Chooi
Marnix H. Medema
Mohammad Alanjary
CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters
BMC Bioinformatics
Gene cluster
Secondary metabolite
Homology search
Colocalized
Biosynthetic
Comparative analysis
title CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters
title_full CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters
title_fullStr CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters
title_full_unstemmed CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters
title_short CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters
title_sort cagecat the comparative gene cluster analysis toolbox for rapid search and visualisation of homologous gene clusters
topic Gene cluster
Secondary metabolite
Homology search
Colocalized
Biosynthetic
Comparative analysis
url https://doi.org/10.1186/s12859-023-05311-2
work_keys_str_mv AT matthiasvandenbelt cagecatthecomparativegeneclusteranalysistoolboxforrapidsearchandvisualisationofhomologousgeneclusters
AT camerongilchrist cagecatthecomparativegeneclusteranalysistoolboxforrapidsearchandvisualisationofhomologousgeneclusters
AT thomasjbooth cagecatthecomparativegeneclusteranalysistoolboxforrapidsearchandvisualisationofhomologousgeneclusters
AT yithengchooi cagecatthecomparativegeneclusteranalysistoolboxforrapidsearchandvisualisationofhomologousgeneclusters
AT marnixhmedema cagecatthecomparativegeneclusteranalysistoolboxforrapidsearchandvisualisationofhomologousgeneclusters
AT mohammadalanjary cagecatthecomparativegeneclusteranalysistoolboxforrapidsearchandvisualisationofhomologousgeneclusters