nPoRe: n-polymer realigner for improved pileup-based variant calling

Abstract Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow...

Bibliographic Details
Main Authors: Tim Dunn, David Blaauw, Reetuparna Das, Satish Narayanasamy
Format: Article
Language: English
Published: BMC 2023-03-01
Series: BMC Bioinformatics
Subjects: Germline variant calling; Alignment; N-polymer; Homopolymer; Short tandem repeat; Copy number
Online Access: https://doi.org/10.1186/s12859-023-05193-4
author Tim Dunn
David Blaauw
Reetuparna Das
Satish Narayanasamy
collection DOAJ
description Abstract Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers (n = 1) and tandem repeats (2 ≤ n ≤ 6). At the same precision, haplotype phasing improves INDEL recall from 63.76% to 70.66%, and nPoRe realignment improves it further to 73.04%.
first_indexed 2024-04-09T22:35:29Z
format Article
id doaj.art-e46296e9a9d4483fb78bf5a83e34f93e
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-09T22:35:29Z
publishDate 2023-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-e46296e9a9d4483fb78bf5a83e34f93e; 2023-03-22T12:33:13Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2023-03-01; vol. 24; no. 1; pp. 1-21; 10.1186/s12859-023-05193-4; nPoRe: n-polymer realigner for improved pileup-based variant calling; Tim Dunn (University of Michigan); David Blaauw (University of Michigan); Reetuparna Das (University of Michigan); Satish Narayanasamy (University of Michigan); Abstract Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers (n = 1) and tandem repeats (2 ≤ n ≤ 6). At the same precision, haplotype phasing improves INDEL recall from 63.76% to 70.66%, and nPoRe realignment improves it further to 73.04%.; https://doi.org/10.1186/s12859-023-05193-4; Germline variant calling; Alignment; N-polymer; Homopolymer; Short tandem repeat; Copy number
title nPoRe: n-polymer realigner for improved pileup-based variant calling
topic Germline variant calling
Alignment
N-polymer
Homopolymer
Short tandem repeat
Copy number
url https://doi.org/10.1186/s12859-023-05193-4
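
For context on the description field above: the baseline it names, Needleman-Wunsch global alignment with affine gap penalties, can be sketched with the textbook three-matrix (Gotoh) recurrence. This is only an illustrative sketch, not nPoRe's implementation. The function name `affine_gap_score` and all score values are assumptions chosen for the example, and the n-polymer-specific gap penalties described in the abstract are not modeled here.

```python
import math


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Global alignment score with affine gaps (textbook Gotoh recurrence).

    A gap of length k contributes gap_open + k * gap_extend.
    All score values are illustrative assumptions, not taken from nPoRe.
    """
    neg = -math.inf
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]
    # X: a[i-1] aligned to a gap (deletion relative to a)
    # Y: b[j-1] aligned to a gap (insertion relative to a)
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays gap_open once; extending pays only gap_extend.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])
```

With these assumed scores, one gap of length two costs less than two separate gaps of length one, which is the property that distinguishes affine gap penalties from linear ones. In a homopolymer run such as `GAAAAT` versus `GAAAT`, every placement of the single-base gap within the run scores the same under this baseline.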