LFastqC: A lossless non-reference-based FASTQ compressor.

Bibliographic Details
Main Authors: Sultan Al Yami, Chun-Hsi Huang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2019-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0224806
author Sultan Al Yami
Chun-Hsi Huang
collection DOAJ
description The cost-effectiveness of next-generation sequencing (NGS) has led to the advancement of genomic research, thereby regularly generating a large amount of raw data that often requires efficient infrastructures such as data centers to manage the storage and transmission of such data. The generated NGS data are highly redundant and need to be efficiently compressed to reduce the cost of storage space and transmission bandwidth. We present a lossless, non-reference-based FASTQ compression algorithm, known as LFastqC, an improvement over the LFQC tool, to address these issues. LFastqC is compared with several state-of-the-art compressors, and the results indicate that LFastqC achieves better compression ratios for important datasets such as the LS454, PacBio, and MinION. Moreover, LFastqC has a better compression and decompression speed than LFQC, which was previously the top-performing compression algorithm for the LS454 dataset. LFastqC is freely available at https://github.uconn.edu/sya12005/LFastqC.
format Article
id doaj.art-a16cd1f2cd7c46079328194375dbb754
institution Directory Open Access Journal
issn 1932-6203
language English
publishDate 2019-01-01
publisher Public Library of Science (PLoS)
series PLoS ONE
spelling doaj.art-a16cd1f2cd7c46079328194375dbb754; 2022-12-21T21:55:24Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2019-01-01; vol. 14; no. 11; e0224806; 10.1371/journal.pone.0224806; LFastqC: A lossless non-reference-based FASTQ compressor.; Sultan Al Yami; Chun-Hsi Huang; https://doi.org/10.1371/journal.pone.0224806
title LFastqC: A lossless non-reference-based FASTQ compressor.
url https://doi.org/10.1371/journal.pone.0224806