ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions

Abstract. Background: In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by t...


Bibliographic Details
Main Authors: van Ham Roeland CHJ, Kaufmann Kerstin, Muiño Jose M, Angenent Gerco C, Krajewski Pawel
Format: Article
Language: English
Published: BMC 2011-05-01
Series: Plant Methods
Online Access: http://www.plantmethods.com/content/7/1/11
author van Ham Roeland CHJ
Kaufmann Kerstin
Muiño Jose M
Angenent Gerco C
Krajewski Pawel
collection DOAJ
description Abstract
Background: In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by this method needs to be analyzed in a statistically proper and computationally efficient manner. The generation of high copy numbers of DNA fragments as an artefact of the PCR step in ChIP-seq is an important source of bias in this methodology.
Results: We present here an R package for the statistical analysis of ChIP-seq experiments. Taking the average size of DNA fragments subjected to sequencing into account, the software calculates single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the ratio test or the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutations. Computational efficiency is achieved by implementing the most time-consuming functions in C++ and integrating these in the R package. An analysis of simulated and experimental ChIP-seq data is presented to demonstrate the robustness of our method against PCR artefacts and its adequate control of the error rate.
Conclusions: The software ChIP-seq Analysis in R (CSAR) enables fast and accurate detection of protein-bound genomic regions through the analysis of ChIP-seq experiments. Compared to existing methods, we found that our package shows greater robustness against PCR artefacts and better control of the error rate.
first_indexed 2024-04-14T04:11:36Z
format Article
id doaj.art-42b71f04069d492cb488d15cd76f5db3
institution Directory Open Access Journal
issn 1746-4811
language English
last_indexed 2024-04-14T04:11:36Z
publishDate 2011-05-01
publisher BMC
record_format Article
series Plant Methods
spelling doaj.art-42b71f04069d492cb488d15cd76f5db3; 2022-12-22T02:13:08Z; eng; BMC; Plant Methods; 1746-4811; 2011-05-01; 7; 1; 11; 10.1186/1746-4811-7-11; ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions; van Ham Roeland CHJ; Kaufmann Kerstin; Muiño Jose M; Angenent Gerco C; Krajewski Pawel; http://www.plantmethods.com/content/7/1/11
title ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions
url http://www.plantmethods.com/content/7/1/11
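Illustrative sketch (not part of the record). The abstract describes a pipeline of fragment-length-aware single-nucleotide coverage, normalization, a Poisson-based comparison of sample against control, and permutation-derived thresholds. The Python below is a minimal sketch of that idea under stated assumptions and is not the CSAR package or its API: all function names, the pseudocount, the read-count scaling used as normalization, and the forward-strand-only read extension are hypothetical choices made for illustration. In particular, the permutation step here takes a quantile of the maximum permuted score, which is a simpler stand-in for the false discovery rate procedure mentioned in the abstract.

```python
import math
import random


def coverage(read_starts, fragment_len, genome_len):
    """Per-nucleotide number of fragments covering each position.

    Each read start is extended to the assumed average fragment length
    (forward strand only, for simplicity).
    """
    diff = [0] * (genome_len + 1)
    for s in read_starts:
        start = max(0, s)
        end = min(genome_len, s + fragment_len)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    cov, running = [], 0
    for d in diff[:genome_len]:
        running += d
        cov.append(running)
    return cov


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed by direct summation."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def poisson_score(sample_cov, control_cov, scale=1.0, pseudo=1.0):
    """-log10 P(X >= sample) with the scaled control coverage as Poisson mean.

    `scale` and `pseudo` are illustrative normalization assumptions.
    """
    scores = []
    for s, c in zip(sample_cov, control_cov):
        p = poisson_sf(s, c * scale + pseudo)
        scores.append(-math.log10(max(p, 1e-300)))
    return scores


def permutation_threshold(sample_starts, control_starts, fragment_len,
                          genome_len, n_perm=100, quantile=0.95, seed=0):
    """Score threshold from random relabelling of reads between groups.

    Returns the chosen quantile of the maximum permuted score (a stand-in
    for a proper FDR-controlling threshold).
    """
    rng = random.Random(seed)
    pooled = list(sample_starts) + list(control_starts)
    n_sample = len(sample_starts)
    n_control = len(control_starts)
    scale = n_sample / n_control if n_control else 1.0
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        s_cov = coverage(pooled[:n_sample], fragment_len, genome_len)
        c_cov = coverage(pooled[n_sample:], fragment_len, genome_len)
        maxima.append(max(poisson_score(s_cov, c_cov, scale=scale)))
    maxima.sort()
    return maxima[min(len(maxima) - 1, int(quantile * len(maxima)))]
```

Positions whose observed score exceeds the value returned by `permutation_threshold` would be candidate enriched regions in this toy setting. For real data, the package described in the record should be used.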