Comprehensive benchmarking of SNV callers for highly admixed tumor data.

Bibliographic Details
Main Authors: Regina Bohnert, Sonia Vivas, Gunther Jansen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5636151?pdf=render
collection DOAJ
description Precision medicine attempts to individualize cancer therapy by matching tumor-specific genetic changes with effective targeted therapies. A crucial first step in this process is the reliable identification of cancer-relevant variants, which is considerably complicated by the impurity and heterogeneity of clinical tumor samples. We compared the impact of admixture of non-cancerous cells and low somatic allele frequencies on the sensitivity and precision of 19 state-of-the-art SNV callers. We studied both whole exome and targeted gene panel data and up to 13 distinct parameter configurations for each tool. We found vast differences among callers. Based on our comprehensive analyses we recommend joint tumor-normal calling with MuTect, EBCall or Strelka for whole exome somatic variant calling, and HaplotypeCaller or FreeBayes for whole exome germline calling. For targeted gene panel data on a single tumor sample, LoFreqStar performed best. We further found that tumor impurity and admixture had a negative impact on precision, and in particular, sensitivity in whole exome experiments. At admixture levels of 60% to 90% sometimes seen in pathological biopsies, sensitivity dropped significantly, even when variants were originally present in the tumor at 100% allele frequency. Sensitivity to low-frequency SNVs improved with targeted panel data, but whole exome data allowed more efficient identification of germline variants. Effective somatic variant calling requires high-quality pathological samples with minimal admixture, a consciously selected sequencing strategy, and the appropriate variant calling tool with settings optimized for the chosen type of data.
id doaj.art-8106aaef3fd743e59c87bc1c68a798d3
institution Directory of Open Access Journals
issn 1932-6203
doi 10.1371/journal.pone.0186175
citation PLoS ONE 12(10): e0186175 (2017)