Identification of single nucleotide variants using position-specific error estimation in deep sequencing data

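Abstract

Background: Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs).

Methods: To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection.

Results: Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments.

Conclusions: AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low-frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform, it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve.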
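The Methods part of the abstract describes estimating background noise from normal samples and testing candidate variants with a Poisson model. The sketch below illustrates that general idea only; it is not AmpliSolve's implementation. The function names, the error-rate floor `min_rate` and the significance threshold `alpha` are all hypothetical choices made for illustration, and the sketch handles a single position, strand and alternative nucleotide at a time.

```python
import math


def estimate_error_rate(normal_counts, min_rate=1e-4):
    """Pool (alt_reads, depth) pairs from normal samples into one error rate.

    The pairs are assumed to describe one position, one strand and one
    alternative nucleotide. min_rate is a hypothetical floor that keeps
    the rate above zero when no normal sample shows the alternative base.
    """
    alt = sum(a for a, _ in normal_counts)
    depth = sum(d for _, d in normal_counts)
    if depth == 0:
        raise ValueError("no coverage in normal samples")
    return max(alt / depth, min_rate)


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1).

    Plain floating point: very small tails are rounded to 0.0, and a very
    large lam underflows. Adequate for a sketch, not for production use.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_snv(alt_reads, depth, error_rate, alpha=0.01):
    """Flag a candidate SNV when the observed alternative reads are unlikely
    to be background noise. alpha is a hypothetical threshold."""
    expected_noise = error_rate * depth
    p_value = poisson_tail(alt_reads, expected_noise)
    return p_value < alpha, p_value


if __name__ == "__main__":
    normals = [(2, 4000), (1, 5000), (3, 6000)]  # made-up counts
    rate = estimate_error_rate(normals)
    print(call_snv(50, 5000, rate))  # 1% VAF at depth 5000
    print(call_snv(2, 5000, rate))   # consistent with background noise
```

A caller that wanted strand-aware behaviour could, under the same assumptions, run `call_snv` separately on forward and reverse counts and require both to pass.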

Bibliographic Details
Main Authors: Dimitrios Kleftogiannis, Marco Punta, Anuradha Jayaram, Shahneen Sandhu, Stephen Q. Wong, Delila Gasi Tandefelt, Vincenza Conteduca, Daniel Wetterskog, Gerhardt Attard, Stefano Lise
Format: Article
Language: English
Published: BMC 2019-08-01
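Series: BMC Medical Genomics
Subjects: Next generation sequencing (NGS); Cancer genomics; Variant calling; Deep sequencing; Targeted sequencing; Ion Torrent
Online Access: http://link.springer.com/article/10.1186/s12920-019-0557-9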
_version_ 1819069238051078144
author Dimitrios Kleftogiannis
Marco Punta
Anuradha Jayaram
Shahneen Sandhu
Stephen Q. Wong
Delila Gasi Tandefelt
Vincenza Conteduca
Daniel Wetterskog
Gerhardt Attard
Stefano Lise
collection DOAJ
description Abstract. Background: Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs). Methods: To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection. Results: Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments. Conclusions: AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low-frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform, it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve.
first_indexed 2024-12-21T16:46:52Z
format Article
id doaj.art-616254a8c8714710aa88f59e833b3a08
institution Directory of Open Access Journals
issn 1755-8794
language English
last_indexed 2024-12-21T16:46:52Z
publishDate 2019-08-01
publisher BMC
record_format Article
series BMC Medical Genomics
spelling doaj.art-616254a8c8714710aa88f59e833b3a08
2022-12-21T18:56:58Z
eng
BMC
BMC Medical Genomics
ISSN 1755-8794
2019-08-01
Volume 12, issue 1, pages 1-12
10.1186/s12920-019-0557-9
Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
Dimitrios Kleftogiannis (Centre for Evolution and Cancer, The Institute of Cancer Research)
Marco Punta (Centre for Evolution and Cancer, The Institute of Cancer Research)
Anuradha Jayaram (UCL Cancer Institute, University College London)
Shahneen Sandhu (Peter MacCallum Cancer Centre and University of Melbourne)
Stephen Q. Wong (Peter MacCallum Cancer Centre and University of Melbourne)
Delila Gasi Tandefelt (Department of Urology, Sahlgrenska Academy, University of Gothenburg)
Vincenza Conteduca (Department of Medical Oncology, Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST) IRCCS)
Daniel Wetterskog (UCL Cancer Institute, University College London)
Gerhardt Attard (UCL Cancer Institute, University College London)
Stefano Lise (Centre for Evolution and Cancer, The Institute of Cancer Research)
title Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
topic Next generation sequencing (NGS)
Cancer genomics
Variant calling
Deep sequencing
Targeted sequencing
Ion Torrent
url http://link.springer.com/article/10.1186/s12920-019-0557-9