DistMap: a toolkit for distributed short read mapping on a Hadoop cluster.
With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliora...
Main Authors: | Ram Vinay Pandey, Christian Schlötterer |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2013-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3751911?pdf=render |
_version_ | 1818247429365956608 |
---|---|
author | Ram Vinay Pandey; Christian Schlötterer |
author_facet | Ram Vinay Pandey; Christian Schlötterer |
author_sort | Ram Vinay Pandey |
collection | DOAJ |
description | With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliorate this bottleneck, we present a new tool, DistMap: a modular, scalable, and integrated workflow to map reads in the Hadoop distributed computing framework. DistMap is easy to use, currently supports nine different short read mapping tools, and can be run on all Unix-based operating systems. It accepts reads in FASTQ format as input and provides mapped reads in SAM/BAM format. DistMap supports both paired-end and single-end reads, thereby allowing the mapping of read data produced by different sequencing platforms. DistMap is available from http://code.google.com/p/distmap/ |
first_indexed | 2024-12-12T15:04:34Z |
format | Article |
id | doaj.art-04391609688c45a5a76f4e8ee3e671f1 |
institution | Directory of Open Access Journals |
issn | 1932-6203 |
language | English |
last_indexed | 2024-12-12T15:04:34Z |
publishDate | 2013-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
spelling | doaj.art-04391609688c45a5a76f4e8ee3e671f1; 2022-12-22T00:20:45Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2013-01-01; 8; 8; e72614; 10.1371/journal.pone.0072614; DistMap: a toolkit for distributed short read mapping on a Hadoop cluster.; Ram Vinay Pandey; Christian Schlötterer; With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliorate this bottleneck we present a new tool, DistMap - a modular, scalable and integrated workflow to map reads in the Hadoop distributed computing framework. DistMap is easy to use, currently supports nine different short read mapping tools and can be run on all Unix-based operating systems. It accepts reads in FASTQ format as input and provides mapped reads in a SAM/BAM format. DistMap supports both paired-end and single-end reads thereby allowing the mapping of read data produced by different sequencing platforms. DistMap is available from http://code.google.com/p/distmap/; http://europepmc.org/articles/PMC3751911?pdf=render |
title | DistMap: a toolkit for distributed short read mapping on a Hadoop cluster. |
url | http://europepmc.org/articles/PMC3751911?pdf=render |