Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
Abstract Background Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a chall...
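Main Authors: | Dimitrios Kleftogiannis, Marco Punta, Anuradha Jayaram, Shahneen Sandhu, Stephen Q. Wong, Delila Gasi Tandefelt, Vincenza Conteduca, Daniel Wetterskog, Gerhardt Attard, Stefano Lise |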
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2019-08-01 |
Series: | BMC Medical Genomics |
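Subjects: | Next generation sequencing (NGS), Cancer genomics, Variant calling, Deep sequencing, Targeted sequencing, Ion torrent |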
Online Access: | http://link.springer.com/article/10.1186/s12920-019-0557-9 |
_version_ | 1819069238051078144 |
---|---|
author | Dimitrios Kleftogiannis, Marco Punta, Anuradha Jayaram, Shahneen Sandhu, Stephen Q. Wong, Delila Gasi Tandefelt, Vincenza Conteduca, Daniel Wetterskog, Gerhardt Attard, Stefano Lise |
author_sort | Dimitrios Kleftogiannis |
collection | DOAJ |
description | Abstract. Background: Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs), with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem, as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs). Methods: To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection. Results: Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments. Conclusions: AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform, it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve. |
first_indexed | 2024-12-21T16:46:52Z |
format | Article |
id | doaj.art-616254a8c8714710aa88f59e833b3a08 |
institution | Directory Open Access Journal |
issn | 1755-8794 |
language | English |
last_indexed | 2024-12-21T16:46:52Z |
publishDate | 2019-08-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Genomics |
spelling | doaj.art-616254a8c8714710aa88f59e833b3a08; 2022-12-21T18:56:58Z; eng; BMC; BMC Medical Genomics; 1755-8794; 2019-08-01; 121112; 10.1186/s12920-019-0557-9; Identification of single nucleotide variants using position-specific error estimation in deep sequencing data; Dimitrios Kleftogiannis [0], Marco Punta [1], Anuradha Jayaram [2], Shahneen Sandhu [3], Stephen Q. Wong [4], Delila Gasi Tandefelt [5], Vincenza Conteduca [6], Daniel Wetterskog [7], Gerhardt Attard [8], Stefano Lise [9]; [0] Centre for Evolution and Cancer, The Institute of Cancer Research; [1] Centre for Evolution and Cancer, The Institute of Cancer Research; [2] UCL Cancer Institute, University College London; [3] Peter MacCallum Cancer Centre and University of Melbourne; [4] Peter MacCallum Cancer Centre and University of Melbourne; [5] Department of Urology, Sahlgrenska Academy, University of Gothenburg; [6] Department of Medical Oncology, Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST) IRCCS; [7] UCL Cancer Institute, University College London; [8] UCL Cancer Institute, University College London; [9] Centre for Evolution and Cancer, The Institute of Cancer Research; Abstract Background Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs). Methods To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection. Results Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments. Conclusions AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve.; http://link.springer.com/article/10.1186/s12920-019-0557-9; Next generation sequencing (NGS); Cancer genomics; Variant calling; Deep sequencing; Targeted sequencing; Ion torrent |
title | Identification of single nucleotide variants using position-specific error estimation in deep sequencing data |
topic | Next generation sequencing (NGS), Cancer genomics, Variant calling, Deep sequencing, Targeted sequencing, Ion torrent |
url | http://link.springer.com/article/10.1186/s12920-019-0557-9 |
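
The abstract describes estimating background noise from normal samples per position, strand and nucleotide, and then applying a Poisson-based test to call SNVs. The following is a minimal illustrative sketch of that general idea, not AmpliSolve's actual implementation. The function names, the `min_rate` floor and the `alpha` threshold are assumptions introduced here for illustration; the record does not specify them.

```python
import math


def estimate_error_rate(normal_counts, min_rate=1e-4):
    """Pool (alt_reads, depth) pairs from normal samples into one error rate.

    The pairs are assumed to refer to a single position, strand and
    alternative nucleotide. min_rate is a hypothetical floor that keeps the
    rate positive when no normal sample shows the alternative base.
    """
    total_alt = sum(alt for alt, _ in normal_counts)
    total_depth = sum(depth for _, depth in normal_counts)
    if total_depth == 0:
        raise ValueError("no coverage in the normal samples at this position")
    return max(total_alt / total_depth, min_rate)


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam).

    Terms are evaluated in log space so that large lam does not underflow.
    Tail probabilities below roughly 1e-16 are reported as 0.0 because the
    result is computed as 1 - CDF.
    """
    if k <= 0:
        return 1.0
    if lam <= 0.0:
        return 0.0
    log_lam = math.log(lam)
    cdf = sum(
        math.exp(-lam + i * log_lam - math.lgamma(i + 1)) for i in range(k)
    )
    return min(1.0, max(0.0, 1.0 - cdf))


def call_snv(alt_reads, depth, error_rate, alpha=0.01):
    """Test whether alt_reads exceeds what background noise would explain.

    Under the noise-only hypothesis the expected number of alternative
    reads is error_rate * depth. alpha is an arbitrary illustrative cutoff.
    """
    expected_noise = error_rate * depth
    p_value = poisson_upper_tail(alt_reads, expected_noise)
    return p_value < alpha, p_value


if __name__ == "__main__":
    # Made-up counts for one position, strand and nucleotide.
    normals = [(2, 1000), (0, 1000), (1, 1000)]
    rate = estimate_error_rate(normals)
    for alt in (5, 50):
        called, p_value = call_snv(alt, 5000, rate)
        print(alt, called, p_value)
```

In a strand-specific setting, one plausible design (again an assumption) is to run `call_snv` separately on forward and reverse counts and accept a variant only when both strands pass.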