Dynamic construction of pan-genome subgraphs
Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of a pan-genome, comprising the genomes of many individuals/strains of the same or closely related species. Subsequent work improved the construction of the compressed de Bruijn graph in terms of run-ti...
Main Authors: | Dede Kadir, Ohlebusch Enno |
---|---|
Format: | Article |
Language: | English |
Published: | De Gruyter 2020-04-01 |
Series: | Open Computer Science |
Subjects: | compressed de Bruijn graph, Burrows-Wheeler transform, backward search, pan-genome analysis |
Online Access: | https://doi.org/10.1515/comp-2020-0018 |
_version_ | 1818579544475435008 |
---|---|
author | Dede Kadir; Ohlebusch Enno |
author_facet | Dede Kadir; Ohlebusch Enno |
author_sort | Dede Kadir |
collection | DOAJ |
description | Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of a pan-genome, comprising the genomes of many individuals/strains of the same or closely related species. Subsequent work improved the construction of the compressed de Bruijn graph in terms of run-time and memory consumption. According to the Computational Pan-Genomics Consortium (Briefings in Bioinformatics 2016), a pan-genome data structure should support the following functionality: “All information within a data structure should be easily accessible for human eyes by visualization support on different scales.” However, a pan-genome graph can have thousands to millions of nodes, and such an amount of information is certainly not easily accessible for human eyes. Thus, the possibility to construct pan-genome subgraphs on demand would be quite valuable. In this article, we use the space-efficient representation of the compressed de Bruijn graph devised by Beller and Ohlebusch (Algorithms for Molecular Biology 2016) to construct pan-genome subgraphs on the fly. The user can specify a region in one of the genomes and the software tool will build a subgraph that contains the path corresponding to that region and all paths that are in the neighborhood of that path. The size of the neighborhood can be controlled by the user. |
first_indexed | 2024-12-16T07:03:23Z |
format | Article |
id | doaj.art-0159213ca2d640b8ada2d0be7b299629 |
institution | Directory Open Access Journal |
issn | 2299-1093 |
language | English |
last_indexed | 2024-12-16T07:03:23Z |
publishDate | 2020-04-01 |
publisher | De Gruyter |
record_format | Article |
series | Open Computer Science |
spelling | doaj.art-0159213ca2d640b8ada2d0be7b299629; 2022-12-21T22:40:05Z; eng; De Gruyter; Open Computer Science; 2299-1093; 2020-04-01; 10; 1; 82; 96; 10.1515/comp-2020-0018; comp-2020-0018; Dynamic construction of pan-genome subgraphs; Dede Kadir 0; Ohlebusch Enno 1; Institute of Theoretical Computer Science, Ulm University, D-89069 Ulm, Germany; Institute of Theoretical Computer Science, Ulm University, D-89069 Ulm, Germany; Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of a pan-genome, comprising the genomes of many individuals/strains of the same or closely related species. Subsequent work improved the construction of the compressed de Bruijn graph in terms of run-time and memory consumption. According to the Computational Pan-Genomics Consortium (Briefings in Bioinformatics 2016), a pan-genome data structure should support the following functionality: “All information within a data structure should be easily accessible for human eyes by visualization support on different scales.” However, a pan-genome graph can have thousands to millions of nodes, and such an amount of information is certainly not easily accessible for human eyes. Thus, the possibility to construct pan-genome subgraphs on demand would be quite valuable. In this article, we use the space-efficient representation of the compressed de Bruijn graph devised by Beller and Ohlebusch (Algorithms for Molecular Biology 2016) to construct pan-genome subgraphs on the fly. The user can specify a region in one of the genomes and the software tool will build a subgraph that contains the path corresponding to that region and all paths that are in the neighborhood of that path. The size of the neighborhood can be controlled by the user.; https://doi.org/10.1515/comp-2020-0018; compressed de bruijn graph; burrows-wheeler transform; backward search; pan-genome analysis |
spellingShingle | Dede Kadir; Ohlebusch Enno; Dynamic construction of pan-genome subgraphs; Open Computer Science; compressed de bruijn graph; burrows-wheeler transform; backward search; pan-genome analysis |
title | Dynamic construction of pan-genome subgraphs |
title_full | Dynamic construction of pan-genome subgraphs |
title_fullStr | Dynamic construction of pan-genome subgraphs |
title_full_unstemmed | Dynamic construction of pan-genome subgraphs |
title_short | Dynamic construction of pan-genome subgraphs |
title_sort | dynamic construction of pan genome subgraphs |
topic | compressed de bruijn graph; burrows-wheeler transform; backward search; pan-genome analysis |
url | https://doi.org/10.1515/comp-2020-0018 |
work_keys_str_mv | AT dedekadir dynamicconstructionofpangenomesubgraphs AT ohlebuschenno dynamicconstructionofpangenomesubgraphs |