ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference
Abstract Background Chromatin accessibility profiling assays such as ATAC-seq and DNase1-seq offer the opportunity to rapidly characterize the regulatory state of the genome at a single nucleotide resolution. Optimization of molecular protocols has enabled the molecular biologist to produce next-gen...
Main Authors: | Thomas J. F. Pranzatelli, Drew G. Michael, John A. Chiorini |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2018-07-01 |
Series: | BMC Genomics |
Subjects: | DNA footprinting, Pipeline, ATAC-seq, DNase1-seq, Regulation, Optimization |
Online Access: | http://link.springer.com/article/10.1186/s12864-018-4943-z |
_version_ | 1818957142521020416 |
---|---|
author | Thomas J. F. Pranzatelli Drew G. Michael John A. Chiorini |
author_facet | Thomas J. F. Pranzatelli Drew G. Michael John A. Chiorini |
author_sort | Thomas J. F. Pranzatelli |
collection | DOAJ |
description | Abstract. Background: Chromatin accessibility profiling assays such as ATAC-seq and DNase1-seq offer the opportunity to rapidly characterize the regulatory state of the genome at single-nucleotide resolution. Optimization of molecular protocols has enabled the molecular biologist to produce next-generation sequencing libraries in several hours, leaving the analysis of sequencing data as the primary obstacle to wide-scale deployment of accessibility profiling assays. To address this obstacle, we have developed an optimized and efficient pipeline for the analysis of ATAC-seq and DNase1-seq data. Results: We executed a multi-dimensional grid search on the NIH Biowulf supercomputing cluster to assess the impact of parameter selection on biological reproducibility and ChIP-seq recovery by analyzing 4560 pipeline configurations. Our analysis improved ChIP-seq recovery by 15% for ATAC-seq and 3% for DNase1-seq and determined that PCR duplicate removal improves biological reproducibility by 36% without significant cost to transcription factor footprinting. Our analyses of downsampled reads identified a point of diminishing returns for increased library sequencing depth, with 95% of the ChIP-seq data of a 200-million-read footprinting library recovered by 160 million reads. Conclusions: We present optimized ATAC-seq and DNase1-seq pipelines in both Snakemake and bash formats, as well as optimal sequencing depths for ATAC-seq and DNase1-seq projects. The optimized ATAC-seq and DNase1-seq analysis pipelines, parameters, and ground-truth ChIP-seq datasets have been made available for deployment and future algorithmic profiling. |
first_indexed | 2024-12-20T11:05:09Z |
format | Article |
id | doaj.art-31ca8508c9564c709377008b05c142e4 |
institution | Directory Open Access Journal |
issn | 1471-2164 |
language | English |
last_indexed | 2024-12-20T11:05:09Z |
publishDate | 2018-07-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj.art-31ca8508c9564c709377008b05c142e4; 2022-12-21T19:42:52Z; eng; BMC; BMC Genomics; 1471-2164; 2018-07-01; 191113; 10.1186/s12864-018-4943-z; ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference; Thomas J. F. Pranzatelli (0); Drew G. Michael (1); John A. Chiorini (2); National Institute of Dental and Craniofacial Research, National Institutes of Health; National Institute of Dental and Craniofacial Research, National Institutes of Health; National Institute of Dental and Craniofacial Research, National Institutes of Health; Abstract. Background: Chromatin accessibility profiling assays such as ATAC-seq and DNase1-seq offer the opportunity to rapidly characterize the regulatory state of the genome at single-nucleotide resolution. Optimization of molecular protocols has enabled the molecular biologist to produce next-generation sequencing libraries in several hours, leaving the analysis of sequencing data as the primary obstacle to wide-scale deployment of accessibility profiling assays. To address this obstacle, we have developed an optimized and efficient pipeline for the analysis of ATAC-seq and DNase1-seq data. Results: We executed a multi-dimensional grid search on the NIH Biowulf supercomputing cluster to assess the impact of parameter selection on biological reproducibility and ChIP-seq recovery by analyzing 4560 pipeline configurations. Our analysis improved ChIP-seq recovery by 15% for ATAC-seq and 3% for DNase1-seq and determined that PCR duplicate removal improves biological reproducibility by 36% without significant cost to transcription factor footprinting. Our analyses of downsampled reads identified a point of diminishing returns for increased library sequencing depth, with 95% of the ChIP-seq data of a 200-million-read footprinting library recovered by 160 million reads. Conclusions: We present optimized ATAC-seq and DNase1-seq pipelines in both Snakemake and bash formats, as well as optimal sequencing depths for ATAC-seq and DNase1-seq projects. The optimized ATAC-seq and DNase1-seq analysis pipelines, parameters, and ground-truth ChIP-seq datasets have been made available for deployment and future algorithmic profiling.; http://link.springer.com/article/10.1186/s12864-018-4943-z; DNA footprinting; Pipeline; ATAC-seq; DNase1-seq; Regulation; Optimization |
spellingShingle | Thomas J. F. Pranzatelli Drew G. Michael John A. Chiorini ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference BMC Genomics DNA footprinting Pipeline ATAC-seq DNase1-seq Regulation Optimization |
title | ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference |
title_full | ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference |
title_fullStr | ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference |
title_full_unstemmed | ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference |
title_short | ATAC2GRN: optimized ATAC-seq and DNase1-seq pipelines for rapid and accurate genome regulatory network inference |
title_sort | atac2grn optimized atac seq and dnase1 seq pipelines for rapid and accurate genome regulatory network inference |
topic | DNA footprinting Pipeline ATAC-seq DNase1-seq Regulation Optimization |
url | http://link.springer.com/article/10.1186/s12864-018-4943-z |
work_keys_str_mv | AT thomasjfpranzatelli atac2grnoptimizedatacseqanddnase1seqpipelinesforrapidandaccurategenomeregulatorynetworkinference AT drewgmichael atac2grnoptimizedatacseqanddnase1seqpipelinesforrapidandaccurategenomeregulatorynetworkinference AT johnachiorini atac2grnoptimizedatacseqanddnase1seqpipelinesforrapidandaccurategenomeregulatorynetworkinference |