GTax: improving de novo transcriptome assembly by removing foreign RNA contamination
Abstract The cost and complexity of generating a complete reference genome means that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This technology is cost-effective but is susceptible to off-target RNA contamination. In this manuscript, we p...
Main Authors: | Roberto Vera Alvarez, David Landsman |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-01-01 |
Series: | Genome Biology |
Online Access: | https://doi.org/10.1186/s13059-023-03141-2 |
_version_ | 1827382116963319808 |
---|---|
author | Roberto Vera Alvarez David Landsman |
collection | DOAJ |
description | Abstract The cost and complexity of generating a complete reference genome means that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This technology is cost-effective but is susceptible to off-target RNA contamination. In this manuscript, we present GTax, a taxonomy-structured database of genomic sequences that can be used with BLAST to detect and remove foreign contamination in RNA sequencing samples before assembly. In addition, we use a de novo transcriptome assembly of Solanum lycopersicum (tomato) to demonstrate that removing foreign contamination in sequencing samples reduces the number of assembled chimeric transcripts. |
first_indexed | 2024-03-08T14:14:54Z |
format | Article |
id | doaj.art-2c915c63d4ab493b8cc65bce4cdf2ca6 |
institution | Directory Open Access Journal |
issn | 1474-760X |
language | English |
last_indexed | 2024-03-08T14:14:54Z |
publishDate | 2024-01-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj.art-2c915c63d4ab493b8cc65bce4cdf2ca6; 2024-01-14T12:25:08Z; eng; BMC; Genome Biology; 1474-760X; 2024-01-01; 251121; 10.1186/s13059-023-03141-2; GTax: improving de novo transcriptome assembly by removing foreign RNA contamination; Roberto Vera Alvarez (Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH); David Landsman (Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH); https://doi.org/10.1186/s13059-023-03141-2 |
title | GTax: improving de novo transcriptome assembly by removing foreign RNA contamination |
url | https://doi.org/10.1186/s13059-023-03141-2 |