A Comparison of Tools for Copy-Number Variation Detection in Germline Whole Exome and Whole Genome Sequencing Data

Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. Relevant CNVs are hard to detect because common structural variations define large parts of the human genome. CNV calling from short-read sequencing would allow single protocol full genomic profiling...


Bibliographic Details
Main Authors: Migle Gabrielaite, Mathias Husted Torp, Malthe Sebro Rasmussen, Sergio Andreu-Sánchez, Filipe Garrett Vieira, Christina Bligaard Pedersen, Savvas Kinalis, Majbritt Busk Madsen, Miyako Kodama, Gül Sude Demircan, Arman Simonyan, Christina Westmose Yde, Lars Rønn Olsen, Rasmus L. Marvig, Olga Østrup, Maria Rossing, Finn Cilius Nielsen, Ole Winther, Frederik Otzen Bagger
Format: Article
Language: English
Published: MDPI AG 2021-12-01
Series: Cancers
Online Access: https://www.mdpi.com/2072-6694/13/24/6283
author Migle Gabrielaite
Mathias Husted Torp
Malthe Sebro Rasmussen
Sergio Andreu-Sánchez
Filipe Garrett Vieira
Christina Bligaard Pedersen
Savvas Kinalis
Majbritt Busk Madsen
Miyako Kodama
Gül Sude Demircan
Arman Simonyan
Christina Westmose Yde
Lars Rønn Olsen
Rasmus L. Marvig
Olga Østrup
Maria Rossing
Finn Cilius Nielsen
Ole Winther
Frederik Otzen Bagger
collection DOAJ
description Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. Relevant CNVs are hard to detect because common structural variations define large parts of the human genome. CNV calling from short-read sequencing would allow full genomic profiling with a single protocol. We reviewed 50 popular CNV calling tools and included 11 tools for benchmarking in a reference cohort encompassing 39 whole genome sequencing (WGS) samples paired with the current clinical standard, SNP-array-based CNV calling. Additionally, for nine samples we also performed whole exome sequencing (WES) to address the effect of sequencing protocol on CNV calling. Furthermore, we included the Gold Standard reference sample NA12878 and tested 12 samples with CNVs confirmed by multiplex ligation-dependent probe amplification (MLPA). Tool performance varied greatly in the number of called CNVs and in bias for CNV lengths. Some tools had near-perfect recall of CNVs from arrays for some samples, but poor precision. Several tools performed better on NA12878, which could be a result of overfitting. We suggest combining the best-performing tools that are based on different methodologies: GATK gCNV, Lumpy, DELLY, and cn.MOPS. The use of background panels to filter frequently called variants could potentially help reduce the total number of called variants.
first_indexed 2024-03-10T04:31:48Z
format Article
id doaj.art-6b2c4b0a3a42411090a0afaa347ed145
institution Directory Open Access Journal
issn 2072-6694
language English
last_indexed 2024-03-10T04:31:48Z
publishDate 2021-12-01
publisher MDPI AG
record_format Article
series Cancers
spelling doaj.art-6b2c4b0a3a42411090a0afaa347ed145; 2023-11-23T04:06:24Z; eng; MDPI AG; Cancers; 2072-6694; 2021-12-01; volume 13, issue 24, article 6283; doi 10.3390/cancers13246283
A Comparison of Tools for Copy-Number Variation Detection in Germline Whole Exome and Whole Genome Sequencing Data
Affiliation (listed for each author): Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, 2100 Copenhagen, Denmark
https://www.mdpi.com/2072-6694/13/24/6283
title A Comparison of Tools for Copy-Number Variation Detection in Germline Whole Exome and Whole Genome Sequencing Data
topic copy-number variation (CNV)
whole genome sequencing (WGS)
whole exome sequencing (WES)
benchmark
bioinformatics
structural variant
url https://www.mdpi.com/2072-6694/13/24/6283