RGFA: powerful and convenient handling of assembly graphs

The “Graphical Fragment Assembly” (GFA) is an emerging format for the representation of sequence assembly graphs, which can be adopted by both de Bruijn graph- and string graph-based assemblers. Here we present RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to...

Bibliographic Details
Main Authors: Giorgio Gonnella, Stefan Kurtz
Format: Article
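Language: English
Published: PeerJ Inc. 2016-11-01
Series: PeerJ
Subjects: GFA format; Sequence assembling; Assembly graph; Software library; Graphical Fragment Assembly; Graph transformation
Online Access: https://peerj.com/articles/2681.pdf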
author Giorgio Gonnella
Stefan Kurtz
collection DOAJ
description The “Graphical Fragment Assembly” (GFA) is an emerging format for the representation of sequence assembly graphs, which can be adopted by both de Bruijn graph- and string graph-based assemblers. Here we present RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to conveniently parse, edit and write GFA files. Complex operations such as the separation of the implicit instances of repeats and the merging of linear paths can be performed. A typical application of RGFA is the editing of a graph, to finish the assembly of a sequence, using information not available to the assembler. We illustrate a use case, in which the assembly of a repetitive metagenomic fosmid insert was completed using a script based on RGFA. Furthermore, we show how the API provided by RGFA can be employed to design complex graph editing algorithms. As an example, we developed a detection algorithm for CRISPRs in a de Bruijn graph. Finally, RGFA can be used for comparing assembly graphs, e.g., to document the changes in a graph after applying a GUI editor. A program, GFAdiff, is provided, which compares the information in two graphs and generates a report or a Ruby script documenting the transformation steps between the graphs.
first_indexed 2024-03-09T06:41:11Z
format Article
id doaj.art-4785e9caf3574c008ff22bae7d6b36bb
institution Directory of Open Access Journals
issn 2167-8359
language English
last_indexed 2024-03-09T06:41:11Z
publishDate 2016-11-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-4785e9caf3574c008ff22bae7d6b36bb; 2023-12-03T10:50:54Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2016-11-01; 4; e2681; 10.7717/peerj.2681; RGFA: powerful and convenient handling of assembly graphs; Giorgio Gonnella (0); Stefan Kurtz (1); (0) Zentrum für Bioinformatik, Universität Hamburg, Hamburg, Germany; (1) Zentrum für Bioinformatik, Universität Hamburg, Hamburg, Germany; https://peerj.com/articles/2681.pdf
title RGFA: powerful and convenient handling of assembly graphs
topic GFA format
Sequence assembling
Assembly graph
Software library
Graphical Fragment Assembly
Graph transformation
url https://peerj.com/articles/2681.pdf
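
The description in this record says that the library lets a user parse, edit and write GFA files. As a rough illustration of what those three steps involve, the following plain-Ruby sketch reads the tab-separated segment (S) and link (L) lines of a GFA1 file, deletes one segment, and writes the graph back. It deliberately does not use RGFA: the names Segment, Link, parse_gfa, write_gfa and delete_segment are hypothetical helpers invented for this example, not part of the RGFA API, and the sketch ignores optional tags on S and L lines.

```ruby
# Minimal GFA1 reader/writer in plain Ruby.
# Illustrative assumption only: this is NOT the RGFA API.
Segment = Struct.new(:name, :sequence)
Link = Struct.new(:from, :from_orient, :to, :to_orient, :overlap)

# Parse GFA1 text into segments and links.
# Any other record type (e.g. the H header) is kept verbatim.
# Optional tags after the required fields of S and L lines are dropped.
def parse_gfa(text)
  graph = { segments: {}, links: [], other: [] }
  text.each_line do |line|
    fields = line.chomp.split("\t")
    case fields[0]
    when "S"
      graph[:segments][fields[1]] = Segment.new(fields[1], fields[2])
    when "L"
      graph[:links] << Link.new(*fields[1..5])
    else
      graph[:other] << line.chomp unless line.strip.empty?
    end
  end
  graph
end

# Serialize the graph: other records first, then segments, then links.
def write_gfa(graph)
  lines = graph[:other].dup
  graph[:segments].each_value { |s| lines << ["S", s.name, s.sequence].join("\t") }
  graph[:links].each { |l| lines << ["L", *l.to_a].join("\t") }
  lines.join("\n") + "\n"
end

# Edit operation: remove a segment and every link that touches it.
def delete_segment(graph, name)
  graph[:segments].delete(name)
  graph[:links].reject! { |l| l.from == name || l.to == name }
  graph
end
```

A typical round trip would call parse_gfa on the contents of a file, apply delete_segment for an unwanted segment, and pass the result to write_gfa. Removing the links together with the segment keeps the written graph free of links that point at a missing segment.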