Mutation discovery in regions of segmental cancer genome amplifications with CoNAn-SNV: a mixture model for next generation sequencing of tumors.

Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumor genome, in particular single nucleotide variants (SNVs). Most current computational and statistical models for analyzing next generation sequencing data, however, do not account for ca...

Bibliographic Details
Main Authors: Anamaria Crisan, Rodrigo Goya, Gavin Ha, Jiarui Ding, Leah M Prentice, Arusha Oloumi, Janine Senz, Thomas Zeng, Kane Tse, Allen Delaney, Marco A Marra, David G Huntsman, Martin Hirst, Sam Aparicio, Sohrab Shah
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3420914?pdf=render
collection DOAJ
description Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumor genome, in particular single nucleotide variants (SNVs). Most current computational and statistical models for analyzing next generation sequencing data, however, do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs), which require special treatment of the data. Here we present CoNAn-SNV (Copy Number Annotated SNV): a novel algorithm for the inference of SNVs that overlap copy number alterations. The method models the notion that genomic regions of segmental duplication and amplification induce an extended genotype space in which a subset of genotypes exhibit heavily skewed allelic distributions in SNVs (rendering them undetectable by methods that assume diploidy). We introduce the concept of modelling allelic counts from sequencing data using a panel of Binomial mixture models where the number of mixture components for a given locus in the genome is informed by a discrete copy number state given as input. We applied CoNAn-SNV to a previously published whole genome shotgun data set obtained from a lobular breast cancer and show that it is able to discover 21 experimentally revalidated somatic non-synonymous mutations that were not detected using copy number insensitive SNV detection algorithms. Importantly, ROC analysis shows that the increased sensitivity of CoNAn-SNV does not result in disproportionate loss of specificity. This was also supported by analysis of a recently published lymphoma genome with a relatively quiescent karyotype, where CoNAn-SNV showed similar results to other callers except in regions of copy number gain, where increased sensitivity was conferred. Our results indicate that in genomically unstable tumors, copy number annotation for SNV detection will be critical to fully characterize the mutational landscape of cancer genomes.
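The copy-number-dependent genotype space in the description can be illustrated with a minimal sketch. This is not the CoNAn-SNV implementation: the function names, the uniform prior over genotypes, the fixed per-genotype allele fractions k/c, and the `error` floor are all assumptions made for illustration, and no parameters are fitted to data.

```python
from math import comb


def genotype_posteriors(variant_reads, depth, copy_number, error=0.01):
    """Posterior over k = 0..copy_number variant alleles at one locus.

    Assumptions (hypothetical, not from the paper): uniform prior over
    genotypes, expected variant fraction k / copy_number, and a small
    `error` floor so the all-reference and all-variant states still
    tolerate sequencing errors.
    """
    likelihoods = []
    for k in range(copy_number + 1):
        # Expected fraction of variant reads for this genotype.
        p = min(max(k / copy_number, error), 1.0 - error)
        # Binomial likelihood of the observed allelic counts.
        likelihoods.append(
            comb(depth, variant_reads)
            * p ** variant_reads
            * (1.0 - p) ** (depth - variant_reads)
        )
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def variant_probability(variant_reads, depth, copy_number, error=0.01):
    """Probability that at least one copy carries the variant allele."""
    posteriors = genotype_posteriors(variant_reads, depth, copy_number, error)
    return 1.0 - posteriors[0]


if __name__ == "__main__":
    # 5 variant reads out of 40: a skewed allelic ratio.
    print("diploid model (3 genotypes):", variant_probability(5, 40, 2))
    print("copy number 4 (5 genotypes):", variant_probability(5, 40, 4))
```

Under these assumptions, a locus with 5 variant reads out of 40 is best explained as sequencing error by the three-genotype diploid model, whereas the five-genotype model for copy number 4 includes a one-in-four state that fits the same counts well.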
first_indexed 2024-12-11T20:54:16Z
format Article
id doaj.art-c839d4c639fe4c849330294810dba997
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-11T20:54:16Z
publishDate 2012-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-c839d4c639fe4c849330294810dba997; 2022-12-22T00:51:09Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2012-01-01; 7(8): e41551; 10.1371/journal.pone.0041551; http://europepmc.org/articles/PMC3420914?pdf=render
title Mutation discovery in regions of segmental cancer genome amplifications with CoNAn-SNV: a mixture model for next generation sequencing of tumors.
url http://europepmc.org/articles/PMC3420914?pdf=render