Short Sequence Aligner Benchmarking for Chromatin Research

Much of today’s molecular science revolves around next-generation sequencing. Frequently, the first step in analyzing such data is aligning sequencing reads to a reference genome. This step is often taken for granted, but any analysis downstream of the alignment will be affected by the aligner’s abi...

Bibliographic Details
Main Authors: John Lawrence Carter, Harlan Stevens, Perry G. Ridge, Steven Michael Johnson
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: International Journal of Molecular Sciences
Subjects: alignment programs; ChIP-seq; NGS
Online Access: https://www.mdpi.com/1422-0067/24/18/14074
author John Lawrence Carter
Harlan Stevens
Perry G. Ridge
Steven Michael Johnson
collection DOAJ
description Much of today’s molecular science revolves around next-generation sequencing. Frequently, the first step in analyzing such data is aligning sequencing reads to a reference genome. This step is often taken for granted, but any analysis downstream of the alignment will be affected by the aligner’s ability to correctly map sequences. In most cases, for research into chromatin structure and nucleosome positioning, ATAC-seq, ChIP-seq, and MNase-seq experiments use short read lengths. How well aligners manage these reads is critical. Most aligner programs will output mapped reads and unmapped reads. However, from a biological point of view, reads will fall into one of three categories: correctly mapped, incorrectly mapped, and unmapped. While increased sequencing depth can often compensate for unmapped reads, incorrectly and correctly mapped reads appear algorithmically identical but can produce biologically significant alterations in the results. For this reason, we are benchmarking various alignment programs to determine their propensity to incorrectly map short reads. As short-read alignment is an important step in ATAC-seq, ChIP-seq, and MNase-seq experiments, caution should be taken in mapping reads to ensure that the most accurate conclusions can be made from the data generated. Our analysis is intended to help investigators new to the field pick the alignment program best suited for their experimental conditions. In general, the aligners we tested performed well. BWA, Bowtie2, and Chromap were all exceptionally accurate, and we recommend using them. Furthermore, we show that longer read lengths do in fact lead to more accurate mappings.
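The three-way split described in the abstract (correctly mapped, incorrectly mapped, unmapped) can be illustrated with a minimal sketch. It assumes a benchmark built on simulated reads, so that each read's true origin is known; the `Read` record, the `classify` and `summarize` helpers, the `tolerance` parameter, and the sample coordinates are all hypothetical and are not taken from the article.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Read:
    """One simulated read: its known origin plus what the aligner reported."""
    name: str
    true_chrom: str               # origin, known only because the read was simulated
    true_pos: int
    mapped_chrom: Optional[str]   # None means the aligner left the read unmapped
    mapped_pos: Optional[int]


def classify(read: Read, tolerance: int = 0) -> str:
    """Place a read in one of the three biological categories."""
    if read.mapped_chrom is None or read.mapped_pos is None:
        return "unmapped"
    same_chrom = read.mapped_chrom == read.true_chrom
    close_enough = abs(read.mapped_pos - read.true_pos) <= tolerance
    if same_chrom and close_enough:
        return "correctly mapped"
    return "incorrectly mapped"


def summarize(reads: Iterable[Read], tolerance: int = 0) -> Counter:
    """Count how many reads fall into each category."""
    return Counter(classify(r, tolerance) for r in reads)


# Hypothetical example data, for illustration only.
reads = [
    Read("r1", "chrI", 100, "chrI", 100),     # mapped to its true origin
    Read("r2", "chrI", 250, "chrII", 4000),   # mapped, but to the wrong place
    Read("r3", "chrII", 75, None, None),      # not mapped at all
    Read("r4", "chrII", 900, "chrII", 900),   # mapped to its true origin
]
counts = summarize(reads)
for category, n in sorted(counts.items()):
    print(f"{category}: {n}")
```

Without the known origin, r1, r2, and r4 would all simply count as mapped, which is why incorrectly mapped reads cannot be told apart from correct ones in the aligner output alone.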
first_indexed 2024-03-10T22:40:47Z
format Article
id doaj.art-ec4231addba549eb8a2599a2fb3e2825
institution Directory Open Access Journal
issn 1661-6596
1422-0067
language English
last_indexed 2024-03-10T22:40:47Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series International Journal of Molecular Sciences
spelling doaj.art-ec4231addba549eb8a2599a2fb3e2825
2023-11-19T11:07:30Z
eng
MDPI AG
International Journal of Molecular Sciences
ISSN 1661-6596, 1422-0067
2023-09-01
Volume 24, Issue 18, Article 14074
DOI 10.3390/ijms241814074
Short Sequence Aligner Benchmarking for Chromatin Research
John Lawrence Carter, Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Harlan Stevens, Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Perry G. Ridge, Department of Biology, Brigham Young University, Provo, UT 84602, USA
Steven Michael Johnson, Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
https://www.mdpi.com/1422-0067/24/18/14074
alignment programs
ChIP-seq
NGS
title Short Sequence Aligner Benchmarking for Chromatin Research
topic alignment programs
ChIP-seq
NGS
url https://www.mdpi.com/1422-0067/24/18/14074