Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments

Bibliographic Details
Main Authors: Erik L. Clarke, Louis J. Taylor, Chunyu Zhao, Andrew Connell, Jung-Jin Lee, Bryton Fett, Frederic D. Bushman, Kyle Bittinger
Format: Article
Language: English
Published: BMC 2019-03-01
Series: Microbiome
Subjects: Sunbeam; Shotgun metagenomic sequencing; Software; Pipeline; Quality control
Online Access: http://link.springer.com/article/10.1186/s40168-019-0658-x
author Erik L. Clarke
Louis J. Taylor
Chunyu Zhao
Andrew Connell
Jung-Jin Lee
Bryton Fett
Frederic D. Bushman
Kyle Bittinger
collection DOAJ
description Abstract
Background: Analysis of mixed microbial communities using metagenomic sequencing experiments requires multiple preprocessing and analytical steps to interpret the microbial and genetic composition of samples. Analytical steps include quality control, adapter trimming, host decontamination, metagenomic classification, read assembly, and alignment to reference genomes.
Results: We present a modular and user-extensible pipeline called Sunbeam that performs these steps in a consistent and reproducible fashion. It can be installed in a single step, does not require administrative access to the host computer system, and can work with most cluster computing frameworks. We also introduce Komplexity, a software tool to eliminate potentially problematic, low-complexity nucleotide sequences from metagenomic data. A unique component of the Sunbeam pipeline is an easy-to-use extension framework that enables users to add custom processing or analysis steps directly to the workflow. The pipeline and its extension framework are well documented, in routine use, and regularly updated.
Conclusions: Sunbeam provides a foundation to build more in-depth analyses and to enable comparisons in metagenomic sequencing experiments by removing problematic, low-complexity reads and standardizing post-processing and analytical steps. Sunbeam is written in Python using the Snakemake workflow management software and is freely available at github.com/sunbeam-labs/sunbeam under the GPLv3.
first_indexed 2024-12-10T10:34:46Z
format Article
id doaj.art-56cc1c964e0d4947b4e19e5831d07f81
institution Directory Open Access Journal
issn 2049-2618
language English
last_indexed 2024-12-10T10:34:46Z
publishDate 2019-03-01
publisher BMC
record_format Article
series Microbiome
spelling doaj.art-56cc1c964e0d4947b4e19e5831d07f81 2022-12-22T01:52:27Z eng BMC Microbiome 2049-2618 2019-03-01 71113 10.1186/s40168-019-0658-x
Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments
Erik L. Clarke (Department of Microbiology, University of Pennsylvania)
Louis J. Taylor (Department of Microbiology, University of Pennsylvania)
Chunyu Zhao (Division of Gastroenterology, Hepatology and Nutrition, The Children’s Hospital of Philadelphia)
Andrew Connell (Department of Microbiology, University of Pennsylvania)
Jung-Jin Lee (Division of Gastroenterology, Hepatology and Nutrition, The Children’s Hospital of Philadelphia)
Bryton Fett (Division of Gastroenterology, Hepatology and Nutrition, The Children’s Hospital of Philadelphia)
Frederic D. Bushman (Department of Microbiology, University of Pennsylvania)
Kyle Bittinger (Division of Gastroenterology, Hepatology and Nutrition, The Children’s Hospital of Philadelphia)
title Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments
topic Sunbeam
Shotgun metagenomic sequencing
Software
Pipeline
Quality control
url http://link.springer.com/article/10.1186/s40168-019-0658-x