Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data

Abstract. Background: The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour-specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have b...

Bibliographic Details
Main Authors: Mosén-Ansorena David, Aransay Ana, Rodríguez-Ezpeleta Naiara
Format: Article
Language: English
Published: BMC 2012-08-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/13/192
_version_ 1811278496146653184
author Mosén-Ansorena David
Aransay Ana
Rodríguez-Ezpeleta Naiara
author_facet Mosén-Ansorena David
Aransay Ana
Rodríguez-Ezpeleta Naiara
author_sort Mosén-Ansorena David
collection DOAJ
description Abstract
Background: The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour-specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have recently been developed, but their performance has yet to be thoroughly assessed. To this end, a comprehensive model that integrates normal cell contamination and intra-tumour heterogeneity, and that can be translated into synthetic data on which to perform benchmarks, is indispensable.
Results: We propose such a model and implement it in an R package called CnaGen to synthetically generate a wide range of alterations under different normal cell contamination levels. Six recently published methods for CNA and loss of heterozygosity (LOH) detection on tumour samples were assessed on these synthetic data and on a dilution series of a breast cancer cell line: ASCAT, GAP, GenoCNA, GPHMM, MixHMM and OncoSNP. We report recall rates as a function of normal cell contamination level and alteration characteristics (length, copy number and LOH state), as well as the false discovery rate distribution for each copy number under different normal cell contamination levels. The assessed methods are in general better at detecting alterations with low copy number and under low levels of normal cell contamination. All methods except GPHMM, which failed to recognize the alteration pattern in the cell-line samples, provided similar results for the synthetic and cell-line sample sets. MixHMM and GenoCNA were the poorest-performing methods, while GAP generally performed better than the other methods. This supports the viability of approaches other than the common hidden Markov model (HMM)-based ones.
Conclusions: We devised and implemented a comprehensive model to generate data that simulate tumoural samples genotyped using SNP arrays. The validity of the model is supported by the similarity of the results obtained with synthetic and real data. Based on these results and on the software implementation of the methods, we recommend GAP for advanced users and GPHMM for a fully driven analysis.
first_indexed 2024-04-13T00:36:58Z
format Article
id doaj.art-acf9cef2854a4abcb8cb0822f58ce1e3
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-13T00:36:58Z
publishDate 2012-08-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-acf9cef2854a4abcb8cb0822f58ce1e3 2022-12-22T03:10:19Z eng BMC BMC Bioinformatics 1471-2105 2012-08-01 13 1 192 10.1186/1471-2105-13-192 Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data Mosén-Ansorena David Aransay Ana Rodríguez-Ezpeleta Naiara http://www.biomedcentral.com/1471-2105/13/192
spellingShingle Mosén-Ansorena David
Aransay Ana
Rodríguez-Ezpeleta Naiara
Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
BMC Bioinformatics
title Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
title_full Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
title_fullStr Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
title_full_unstemmed Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
title_short Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
title_sort comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
url http://www.biomedcentral.com/1471-2105/13/192
work_keys_str_mv AT mosenansorenadavid comparisonofmethodstodetectcopynumberalterationsincancerusingsimulatedandrealgenotypingdata
AT aransayana comparisonofmethodstodetectcopynumberalterationsincancerusingsimulatedandrealgenotypingdata
AT rodriguezezpeletanaiara comparisonofmethodstodetectcopynumberalterationsincancerusingsimulatedandrealgenotypingdata