CarrierSeq: a sequence analysis workflow for low-input nanopore sequencing

Background: Long-read nanopore sequencing technology is of particular significance for taxonomic identification at or below the species level. For many environmental samples, the total extractable DNA is far below the current input requirements of nanopore sequencing, preventing “sam...

Bibliographic Details
Main Authors: Hachey, Julie; Ruvkun, Gary; Mojarro, Angel; Zuber, Maria; Carr, Christopher E.
Other Authors: Massachusetts Institute of Technology. Department of Earth, Atmospheric, and Planetary Sciences
Format: Article
Language: English
Published: BioMed Central 2018
Online Access: http://hdl.handle.net/1721.1/114584
https://orcid.org/0000-0003-4547-4747
https://orcid.org/0000-0003-2652-8017
collection MIT
description Background: Long-read nanopore sequencing technology is of particular significance for taxonomic identification at or below the species level. For many environmental samples, the total extractable DNA is far below the current input requirements of nanopore sequencing, preventing “sample to sequence” metagenomics from low-biomass or recalcitrant samples. Results: Here we address this problem by employing carrier sequencing, a method to sequence low-input DNA by preparing the target DNA with a genomic carrier to achieve ideal library preparation and sequencing stoichiometry without amplification. We then use CarrierSeq, a sequence analysis workflow to identify the low-input target reads from the genomic carrier. We tested CarrierSeq experimentally by sequencing from a combination of 0.2 ng Bacillus subtilis ATCC 6633 DNA in a background of 1000 ng Enterobacteria phage λ DNA. After filtering of carrier, low quality, and low complexity reads, we detected target reads (B. subtilis), contamination reads, and “high quality noise reads” (HQNRs) not mapping to the carrier, target or known lab contaminants. These reads appear to be artifacts of the nanopore sequencing process as they are associated with specific channels (pores). Conclusion: By treating sequencing as a Poisson arrival process, we implement a statistical test to reject data from channels dominated by HQNRs while retaining low-input target reads.
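The Poisson test mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default of 512 channels, the significance level `alpha`, and the Bonferroni-style correction are all assumptions made for illustration. The idea is that if reads remaining after filtering arrived uniformly across channels, each channel's count would be roughly Poisson with mean (reads / channels), so a channel holding far more reads than that is a candidate HQNR source.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed from the CDF."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def flag_noisy_channels(read_channels, n_channels=512, alpha=0.05):
    """Flag channels whose read count is implausibly high under a
    uniform Poisson arrival model.

    read_channels: one channel id per retained read (hypothetical input).
    n_channels:    assumed number of channels on the flow cell.
    alpha:         assumed family-wise significance level.
    """
    counts = Counter(read_channels)
    lam = len(read_channels) / n_channels  # expected reads per channel
    threshold = alpha / n_channels         # Bonferroni-style correction
    return sorted(ch for ch, k in counts.items()
                  if poisson_sf(k, lam) < threshold)


if __name__ == "__main__":
    # Toy data: channel 7 produces many reads, twenty channels produce one each.
    reads = [7] * 30 + list(range(1, 21))
    print(flag_noisy_channels(reads))
```

Reads from flagged channels would be set aside as likely noise and the remainder kept as candidate target reads. Estimating the rate from all reads, including the noisy ones, is a simplification of this sketch.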
id mit-1721.1/114584
institution Massachusetts Institute of Technology
funding United States. National Aeronautics and Space Administration (Award NNX15AF85G)
citation Mojarro, Angel et al. "CarrierSeq: a sequence analysis workflow for low-input nanopore sequencing." BMC Bioinformatics 19 (March 2018):108
journal BMC Bioinformatics
issn 1471-2105
doi http://dx.doi.org/10.1186/s12859-018-2124-3
rights © 2018 The Authors
license Creative Commons Attribution http://creativecommons.org/licenses/by/4.0/
dates 2018-04-06T14:14:46Z 2018-04-06T14:14:46Z 2018-03 2017-10 2018-04-01T12:59:11Z