nPoRe: n-polymer realigner for improved pileup-based variant calling
Abstract Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow...
Main Authors: | Tim Dunn, David Blaauw, Reetuparna Das, Satish Narayanasamy |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2023-03-01 |
Series: | BMC Bioinformatics |
Subjects: | Germline variant calling, Alignment, N-polymer, Homopolymer, Short tandem repeat, Copy number |
Online Access: | https://doi.org/10.1186/s12859-023-05193-4 |
_version_ | 1797863363894575104 |
---|---|
author | Tim Dunn David Blaauw Reetuparna Das Satish Narayanasamy |
author_sort | Tim Dunn |
collection | DOAJ |
description | Abstract Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers (n = 1) and tandem repeats (2 ≤ n ≤ 6). At the same precision, haplotype phasing improves INDEL recall from 63.76% to 70.66%, and nPoRe realignment improves it further to 73.04%. |
first_indexed | 2024-04-09T22:35:29Z |
format | Article |
id | doaj.art-e46296e9a9d4483fb78bf5a83e34f93e |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-04-09T22:35:29Z |
publishDate | 2023-03-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-e46296e9a9d4483fb78bf5a83e34f93e 2023-03-22T12:33:13Z eng BMC BMC Bioinformatics 1471-2105 2023-03-01 241121 10.1186/s12859-023-05193-4 nPoRe: n-polymer realigner for improved pileup-based variant calling Tim Dunn 0 David Blaauw 1 Reetuparna Das 2 Satish Narayanasamy 3 University of Michigan University of Michigan University of Michigan University of Michigan Abstract Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers (n = 1) and tandem repeats (2 ≤ n ≤ 6). At the same precision, haplotype phasing improves INDEL recall from 63.76% to 70.66%, and nPoRe realignment improves it further to 73.04%. https://doi.org/10.1186/s12859-023-05193-4 Germline variant calling Alignment N-polymer Homopolymer Short tandem repeat Copy number |
spellingShingle | Tim Dunn David Blaauw Reetuparna Das Satish Narayanasamy nPoRe: n-polymer realigner for improved pileup-based variant calling BMC Bioinformatics Germline variant calling Alignment N-polymer Homopolymer Short tandem repeat Copy number |
title | nPoRe: n-polymer realigner for improved pileup-based variant calling |
title_sort | npore n polymer realigner for improved pileup based variant calling |
topic | Germline variant calling Alignment N-polymer Homopolymer Short tandem repeat Copy number |
url | https://doi.org/10.1186/s12859-023-05193-4 |
work_keys_str_mv | AT timdunn nporenpolymerrealignerforimprovedpileupbasedvariantcalling AT davidblaauw nporenpolymerrealignerforimprovedpileupbasedvariantcalling AT reetuparnadas nporenpolymerrealignerforimprovedpileupbasedvariantcalling AT satishnarayanasamy nporenpolymerrealignerforimprovedpileupbasedvariantcalling |
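
The abstract describes the method as an extension of Needleman-Wunsch affine gap alignment. As a rough illustration of that baseline only, the sketch below scores a global alignment with the standard three-matrix affine gap recurrence (Gotoh's formulation). It does not implement the n-polymer gap penalties that nPoRe adds; the function name `affine_gap_score` and all scoring values are hypothetical choices for this example, not parameters taken from the paper.

```python
import math


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Global alignment score with affine gaps (baseline sketch, not nPoRe).

    A gap of length k is scored as gap_open + k * gap_extend.
    All default scores are illustrative assumptions.
    """
    neg = -math.inf
    n, m = len(a), len(b)
    # M: alignment ends with a[i-1] paired to b[j-1]
    # X: alignment ends with a[i-1] paired to a gap (deletion from a)
    # Y: alignment ends with b[j-1] paired to a gap (insertion from b)
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap pays gap_open once; extending pays only gap_extend.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # A homopolymer length difference: one 2-base gap is preferred to two 1-base gaps.
    print(affine_gap_score("AAAA", "AA"))
```

With these example scores, a two-base contraction of a homopolymer is charged a single gap opening plus two extensions. Per the abstract, nPoRe replaces this uniform treatment with dedicated penalties for copy number changes in homopolymers and tandem repeats; the form of those penalties is not given in this record and is not reproduced here.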