Universal Count Correction for High-Throughput Sequencing

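We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called Fixseq. We demonstrate that Fixseq substantially improves the performance of existing RNA-seq, DNase-seq, and ChIP-seq analysis tools when compared with existing alternatives.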


Bibliographic Details
Main Authors: Hashimoto, Tatsunori Benjamin; Edwards, Matthew Douglas; Gifford, David K.
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Public Library of Science 2014
Online Access: http://hdl.handle.net/1721.1/86363
https://orcid.org/0000-0003-0521-5855
https://orcid.org/0000-0002-5845-748X
https://orcid.org/0000-0003-1709-4034
collection MIT
description We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called Fixseq. We demonstrate that Fixseq substantially improves the performance of existing RNA-seq, DNase-seq, and ChIP-seq analysis tools when compared with existing alternatives.
first_indexed 2024-09-23T09:42:44Z
format Article
id mit-1721.1/86363
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T09:42:44Z
publishDate 2014
publisher Public Library of Science
record_format dspace
spelling mit-1721.1/86363
2022-09-30T16:20:53Z
Universal Count Correction for High-Throughput Sequencing
Hashimoto, Tatsunori Benjamin
Edwards, Matthew Douglas
Gifford, David K.
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
National Institutes of Health (U.S.) (NIH grant no. 5-U01-HG007037)
National Science Foundation (U.S.) (NSF grant no. 0645960)
Qatar Computing Research Institute
2014-05-02T14:50:09Z
2014-03
Article
http://purl.org/eprint/type/JournalArticle
1553-7358
http://hdl.handle.net/1721.1/86363
Hashimoto, Tatsunori B., Matthew D. Edwards, and David K. Gifford. “Universal Count Correction for High-Throughput Sequencing.” Edited by Alice Carolyn McHardy. PLoS Comput Biol 10, no. 3 (March 6, 2014): e1003494.
https://orcid.org/0000-0003-0521-5855
https://orcid.org/0000-0002-5845-748X
https://orcid.org/0000-0003-1709-4034
en_US
http://dx.doi.org/10.1371/journal.pcbi.1003494
PLoS Computational Biology
Creative Commons Attribution
http://creativecommons.org/licenses/by/4.0/
application/pdf
Public Library of Science
PLoS