transXpress: a Snakemake pipeline for streamlined de novo transcriptome assembly and annotation

Abstract. Background: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows...

Bibliographic Details
Main Authors: Timothy R. Fallon, Tereza Čalounová, Martin Mokrejš, Jing-Ke Weng, Tomáš Pluskal
Format: Article
Language: English
Published: BMC 2023-04-01
Series: BMC Bioinformatics
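Subjects: De novo transcriptome assembly; RNA-seq; Non-model organisms; Transcriptome annotation; Differential expression analysis; Reproducible software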
Online Access: https://doi.org/10.1186/s12859-023-05254-8
author Timothy R. Fallon
Tereza Čalounová
Martin Mokrejš
Jing-Ke Weng
Tomáš Pluskal
collection DOAJ
description Abstract. Background: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. Results: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. Conclusions: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
first_indexed 2024-04-09T18:51:04Z
format Article
id doaj.art-d93eded9aae04aa187bcae13e6236923
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-09T18:51:04Z
publishDate 2023-04-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-d93eded9aae04aa187bcae13e6236923
2023-04-09T11:28:28Z
Language: eng
Publisher: BMC
Series: BMC Bioinformatics
ISSN: 1471-2105
Published: 2023-04-01
Volume 24, issue 1, pages 1-11
DOI: 10.1186/s12859-023-05254-8
Title: transXpress: a Snakemake pipeline for streamlined de novo transcriptome assembly and annotation
Authors and affiliations:
Timothy R. Fallon (Scripps Institution of Oceanography, UC San Diego)
Tereza Čalounová (Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences)
Martin Mokrejš (Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences)
Jing-Ke Weng (Whitehead Institute for Biomedical Research)
Tomáš Pluskal (Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences)
Abstract. Background: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. Results: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. Conclusions: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
URL: https://doi.org/10.1186/s12859-023-05254-8
Keywords: De novo transcriptome assembly; RNA-seq; Non-model organisms; Transcriptome annotation; Differential expression analysis; Reproducible software
title transXpress: a Snakemake pipeline for streamlined de novo transcriptome assembly and annotation
topic De novo transcriptome assembly
RNA-seq
Non-model organisms
Transcriptome annotation
Differential expression analysis
Reproducible software
url https://doi.org/10.1186/s12859-023-05254-8