Unlocking Short Read Sequencing for Metagenomics

Background: Different high-throughput nucleic acid sequencing platforms are currently available, but a trade-off currently exists between the cost and number of reads that can be generated versus the read length that can be achieved. Methodology/Principal Findings: We describe an experimental an...

Bibliographic Details
Main Authors: Chisholm, Sallie (Penny), Rodrigue, Sebastien, Materna, Arne, Timberlake, Sonia Crago, Blackburn, Matthew C., Malmstrom, Rex R., Alm, Eric J.
Other Authors: Massachusetts Institute of Technology. Department of Biological Engineering
Format: Article
Language: en_US
Published: Public Library of Science 2010
Online Access: http://hdl.handle.net/1721.1/60369
https://orcid.org/0000-0001-8294-9364
_version_ 1826206270188683264
author Chisholm, Sallie (Penny)
Rodrigue, Sebastien
Materna, Arne
Timberlake, Sonia Crago
Blackburn, Matthew C.
Malmstrom, Rex R.
Alm, Eric J.
author2 Massachusetts Institute of Technology. Department of Biological Engineering
author_sort Chisholm, Sallie (Penny)
collection MIT
description Background: Different high-throughput nucleic acid sequencing platforms are currently available, but a trade-off currently exists between the cost and number of reads that can be generated versus the read length that can be achieved. Methodology/Principal Findings: We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read. Conclusions/Significance: This strategy is broadly applicable to sequencing applications that benefit from low-cost high-throughput sequencing, but require longer read lengths. We demonstrate that our approach enables metagenomic analyses using the Illumina Genome Analyzer, with low error rates, and at a fraction of the cost of pyrosequencing.
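The overlap-joining idea in the description can be illustrated with a minimal Python sketch. This is not the SHERA implementation, whose algorithm and scoring are not given in this record. The function names (`reverse_complement`, `merge_pair`), the parameters `min_overlap` and `max_mismatch_frac`, the longest-acceptable-overlap rule, and the "keep the higher-quality base" rule are all assumptions made for illustration. Quality scores are assumed to be lists of integers, one per base.

```python
from typing import List, Optional

# Complement table for the nucleotide alphabet assumed in this sketch.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(
    read1: str,
    qual1: List[int],
    read2: str,
    qual2: List[int],
    min_overlap: int = 10,            # hypothetical default
    max_mismatch_frac: float = 0.1,   # hypothetical default
) -> Optional[str]:
    """Join two overlapping paired-end reads into one composite read.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented before comparison. Returns None when no
    acceptable overlap is found.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")

    mate = reverse_complement(read2)
    mate_qual = qual2[::-1]

    # Try the longest candidate overlap first and accept the first one
    # whose mismatch fraction is within the allowed limit.
    for overlap in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = mate[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            # Within the overlap, keep the base with the higher quality.
            consensus = "".join(
                a if qa >= qb else b
                for a, b, qa, qb in zip(
                    tail, head, qual1[-overlap:], mate_qual[:overlap]
                )
            )
            return read1[:-overlap] + consensus + mate[overlap:]
    return None
```

For a 30 bp fragment read as two 20 bp mates, the sketch looks for the 10 bp region the mates share and returns the 30 bp composite. A production tool would normally score candidate overlaps statistically instead of taking the first acceptable one.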
first_indexed 2024-09-23T13:26:47Z
format Article
id mit-1721.1/60369
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T13:26:47Z
publishDate 2010
publisher Public Library of Science
record_format dspace
spelling mit-1721.1/60369
2022-09-28T14:17:51Z
Unlocking Short Read Sequencing for Metagenomics
Massachusetts Institute of Technology. Department of Biological Engineering
Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Gordon and Betty Moore Foundation (Marine Microbiology Initiative)
Center for Microbial Oceanography: Research and Education
United States. Dept. of Energy (Genome-to-Life)
Natural Sciences and Engineering Research Council of Canada
Fonds québécois de la recherche sur la nature et les technologies
2010-12-22T20:22:59Z
2010-12-22T20:22:59Z
2010-07
2010-05
Article
http://purl.org/eprint/type/JournalArticle
1932-6203
http://hdl.handle.net/1721.1/60369
Rodrigue, Sébastien et al. “Unlocking Short Read Sequencing for Metagenomics.” PLoS ONE 5.7 (2010): e11840.
https://orcid.org/0000-0001-8294-9364
en_US
http://dx.doi.org/10.1371/journal.pone.0011840
PLoS ONE
Creative Commons Attribution
http://creativecommons.org/licenses/by/2.5/
application/pdf
Public Library of Science
PLoS
title Unlocking Short Read Sequencing for Metagenomics
url http://hdl.handle.net/1721.1/60369
https://orcid.org/0000-0001-8294-9364