MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data

While metagenome sequencing may provide insights on the genome sequences and composition of microbial communities, metatranscriptome analysis can be useful for studying the functional activity of a microbiome. RNA-Seq data provides the possibility to determine active genes in the community and how t...

Bibliographic Details
Main Authors: Daria Shafranskaya, Varsha Kale, Rob Finn, Alla L. Lapidus, Anton Korobeynikov, Andrey D. Prjibelski
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-10-01
Series: Frontiers in Microbiology
Subjects: metatranscriptomics, metagenomics, RNA-Seq, de novo assembly, computational pipeline
Online Access: https://www.frontiersin.org/articles/10.3389/fmicb.2022.981458/full
_version_ 1811335969130938368
author Daria Shafranskaya
Varsha Kale
Rob Finn
Alla L. Lapidus
Anton Korobeynikov
Andrey D. Prjibelski
author_facet Daria Shafranskaya
Varsha Kale
Rob Finn
Alla L. Lapidus
Anton Korobeynikov
Andrey D. Prjibelski
author_sort Daria Shafranskaya
collection DOAJ
description While metagenome sequencing may provide insights into the genome sequences and composition of microbial communities, metatranscriptome analysis can be useful for studying the functional activity of a microbiome. RNA-Seq data makes it possible to determine active genes in the community and how their expression levels depend on external conditions. Although the field of metatranscriptomics is relatively young, the number of projects related to metatranscriptome analysis increases every year and the scope of its applications expands. However, several problems complicate metatranscriptome analysis: the complexity of microbial communities, the wide dynamic range of transcriptome expression and, importantly, the lack of high-quality computational methods for assembling meta-RNA sequencing data. These factors reduce the contiguity and completeness of metatranscriptome assemblies, thereby affecting downstream analysis. Here we present MetaGT, a pipeline for de novo assembly of metatranscriptomes, which is based on the idea of combining metatranscriptomic and metagenomic data sequenced from the same sample. MetaGT assembles metatranscriptomic contigs and fills in missing regions based on their alignments to the metagenome assembly. This approach makes it possible to overcome the described complexities and obtain complete RNA sequences, and additionally to estimate their abundances. Using various publicly available real and simulated datasets, we demonstrate that MetaGT yields a significant improvement in coverage and completeness of metatranscriptome assemblies compared to existing methods that do not exploit metagenomic data. The pipeline is implemented in Nextflow and is freely available from https://github.com/ablab/metaGT.
first_indexed 2024-04-13T17:32:22Z
format Article
id doaj.art-9ef2074a7aee4e6bb073df2d16b72d7e
institution Directory Open Access Journal
issn 1664-302X
language English
last_indexed 2024-04-13T17:32:22Z
publishDate 2022-10-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Microbiology
spelling doaj.art-9ef2074a7aee4e6bb073df2d16b72d7e 2022-12-22T02:37:31Z eng Frontiers Media S.A. Frontiers in Microbiology 1664-302X 2022-10-01 13 10.3389/fmicb.2022.981458 981458
MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
Daria Shafranskaya (0), Varsha Kale (1), Rob Finn (2), Alla L. Lapidus (3), Anton Korobeynikov (4), Andrey D. Prjibelski (5, 6)
0 Center for Algorithmic Biotechnology, Saint Petersburg State University, Saint Petersburg, Russia
1 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
2 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
3 Center for Algorithmic Biotechnology, Saint Petersburg State University, Saint Petersburg, Russia
4 Center for Algorithmic Biotechnology, Saint Petersburg State University, Saint Petersburg, Russia
5 Center for Algorithmic Biotechnology, Saint Petersburg State University, Saint Petersburg, Russia
6 Department of Computer Science, University of Helsinki, Helsinki, Finland
https://www.frontiersin.org/articles/10.3389/fmicb.2022.981458/full
metatranscriptomics, metagenomics, RNA-Seq, de novo assembly, computational pipeline
spellingShingle Daria Shafranskaya
Varsha Kale
Rob Finn
Alla L. Lapidus
Anton Korobeynikov
Andrey D. Prjibelski
MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
Frontiers in Microbiology
metatranscriptomics
metagenomics
RNA-Seq
de novo assembly
computational pipeline
title MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
title_full MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
title_fullStr MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
title_full_unstemmed MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
title_short MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
title_sort metagt a pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data
topic metatranscriptomics
metagenomics
RNA-Seq
de novo assembly
computational pipeline
url https://www.frontiersin.org/articles/10.3389/fmicb.2022.981458/full
work_keys_str_mv AT dariashafranskaya metagtapipelinefordenovoassemblyofmetatranscriptomeswiththeaidofmetagenomicdata
AT varshakale metagtapipelinefordenovoassemblyofmetatranscriptomeswiththeaidofmetagenomicdata
AT robfinn metagtapipelinefordenovoassemblyofmetatranscriptomeswiththeaidofmetagenomicdata
AT allallapidus metagtapipelinefordenovoassemblyofmetatranscriptomeswiththeaidofmetagenomicdata
AT antonkorobeynikov metagtapipelinefordenovoassemblyofmetatranscriptomeswiththeaidofmetagenomicdata
AT andreydprjibelski metagtapipelinefordenovoassemblyofmetatranscriptomeswiththeaidofmetagenomicdata