A method to reduce ancestry related germline false positives in tumor only somatic variant calling

Abstract Background Significant clinical and research applications are driving large scale adoption of individualized tumor sequencing in cancer in order to identify tumor-specific mutations. When a matched germline sample is available, somatic mutations may be identified using comparative callers....

Bibliographic Details
Main Authors: Rebecca F. Halperin, John D. Carpten, Zarko Manojlovic, Jessica Aldrich, Jonathan Keats, Sara Byron, Winnie S. Liang, Megan Russell, Daniel Enriquez, Ana Claasen, Irene Cherni, Baffour Awuah, Joseph Oppong, Max S. Wicha, Lisa A. Newman, Evelyn Jaigge, Seungchan Kim, David W. Craig
Format: Article
Language: English
Published: BMC 2017-10-01
Series: BMC Medical Genomics
Online Access: http://link.springer.com/article/10.1186/s12920-017-0296-8
collection DOAJ
description Abstract
Background: Significant clinical and research applications are driving large scale adoption of individualized tumor sequencing in cancer in order to identify tumor-specific mutations. When a matched germline sample is available, somatic mutations may be identified using comparative callers. However, matched germline samples are frequently not available, such as with archival tissues, which makes it difficult to distinguish somatic from germline variants. While population databases may be used to filter out known germline variants, recent studies have shown that private germline variants result in an inflated false positive rate in unmatched tumor samples, and the number of germline false positives in an individual may be related to ancestry.
Methods: First, we examined the relationship between the germline false positives and ancestry. Then we developed and implemented a tumor-only caller (LumosVar) that leverages differences in allelic frequency between somatic and germline variants in impure tumors. We used simulated data to systematically examine how copy number alterations, tumor purity, and sequencing depth should affect the sensitivity of our caller. Finally, we evaluated the caller on real data.
Results: We find the germline false-positive rate is significantly higher for individuals of non-European ancestry, largely due to the limited diversity in public polymorphism databases and due to population-specific characteristics such as admixture or recent expansions. Our Bayesian tumor-only caller (LumosVar) is able to greatly reduce false positives from private germline variants, and our sensitivity is similar to predictions based on simulated data.
Conclusions: Taken together, our results suggest that studies of individuals of non-European ancestry would most benefit from our approach. However, high sensitivity requires sufficiently impure tumors and adequate sequencing depth. Even in impure tumors, there are copy number alterations that result in germline and somatic variants having similar allele frequencies, limiting the sensitivity of the approach. We believe our approach could greatly improve the analysis of archival samples in a research setting where the normal is not available.
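The abstract's central idea, that somatic and germline variants show different allele frequencies in impure tumors, can be illustrated with a short calculation. The sketch below is not LumosVar's implementation; it assumes a textbook mixture model (diploid normal cells, one clonal tumor population, binomially distributed read counts), and all function names and parameters are hypothetical, chosen only for illustration.

```python
import math


def expected_vaf_germline_het(purity, tumor_cn, variant_copies_in_tumor):
    """Expected variant allele fraction of a heterozygous germline variant.

    Assumes normal cells are diploid and carry exactly one variant copy.
    """
    variant = purity * variant_copies_in_tumor + (1.0 - purity) * 1
    total = purity * tumor_cn + (1.0 - purity) * 2
    return variant / total


def expected_vaf_somatic(purity, tumor_cn, variant_copies_in_tumor):
    """Expected variant allele fraction of a clonal somatic variant.

    Normal cells contribute reads but no variant copies.
    """
    variant = purity * variant_copies_in_tumor
    total = purity * tumor_cn + (1.0 - purity) * 2
    return variant / total


def binom_log_lik(alt_reads, depth, vaf):
    """Binomial log-likelihood of seeing alt_reads variant reads out of depth."""
    vaf = min(max(vaf, 1e-6), 1.0 - 1e-6)  # keep the logarithms finite
    return (math.log(math.comb(depth, alt_reads))
            + alt_reads * math.log(vaf)
            + (depth - alt_reads) * math.log(1.0 - vaf))


def posterior_somatic(alt_reads, depth, purity, tumor_cn=2,
                      germline_copies=1, somatic_copies=1, prior_somatic=0.5):
    """Posterior probability that a variant is somatic rather than germline.

    Two-hypothesis Bayes rule; the 0.5 prior is an arbitrary placeholder.
    """
    log_s = (binom_log_lik(alt_reads, depth,
                           expected_vaf_somatic(purity, tumor_cn, somatic_copies))
             + math.log(prior_somatic))
    log_g = (binom_log_lik(alt_reads, depth,
                           expected_vaf_germline_het(purity, tumor_cn, germline_copies))
             + math.log(1.0 - prior_somatic))
    top = max(log_s, log_g)  # subtract the maximum for numerical stability
    w_s, w_g = math.exp(log_s - top), math.exp(log_g - top)
    return w_s / (w_s + w_g)
```

Under these assumptions, a 50% pure diploid tumor gives an expected allele fraction of 0.5 for a heterozygous germline variant and 0.25 for a single-copy somatic variant, so read counts can separate the two. In a 100% pure tumor both expectations are 0.5 and the posterior stays at the prior. The copy-number caveat from the Conclusions also appears: in a single-copy region where the tumor has lost the germline variant allele, both expectations equal 1/3 at 50% purity.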
id doaj.art-fde1123b7aa84a3eadeb4956afa4c35d
institution Directory of Open Access Journals
issn 1755-8794
affiliation Rebecca F. Halperin: Center for Translational Innovation, Translational Genomics Research Institute
John D. Carpten: Department of Translational Genomics, University of Southern California
Zarko Manojlovic: Department of Translational Genomics, University of Southern California
Jessica Aldrich: Integrated Cancer Division, Translational Genomics Research Institute
Jonathan Keats: Integrated Cancer Division, Translational Genomics Research Institute
Sara Byron: Center for Translational Innovation, Translational Genomics Research Institute
Winnie S. Liang: Neurogenomics Division, Translational Genomics Research Institute
Megan Russell: Integrated Cancer Division, Translational Genomics Research Institute
Daniel Enriquez: Neurogenomics Division, Translational Genomics Research Institute
Ana Claasen: Neurogenomics Division, Translational Genomics Research Institute
Irene Cherni: Integrated Cancer Division, Translational Genomics Research Institute
Baffour Awuah: Komfo Anokye Teaching Hospital
Joseph Oppong: Komfo Anokye Teaching Hospital
Max S. Wicha: University of Michigan
Lisa A. Newman: Henry Ford Health Systems
Evelyn Jaigge: University of Michigan
Seungchan Kim: Integrated Cancer Division, Translational Genomics Research Institute
David W. Craig: Department of Translational Genomics, University of Southern California
topic Somatic mutation
Germline variant
Next generation sequencing
Cancer
Precision medicine
Tumor purity