A Comparison of Tools for Copy-Number Variation Detection in Germline Whole Exome and Whole Genome Sequencing Data
Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. Relevant CNVs are hard to detect because common structural variations define large parts of the human genome. CNV calling from short-read sequencing would allow single protocol full genomic profiling...
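Main Authors: | Migle Gabrielaite, Mathias Husted Torp, Malthe Sebro Rasmussen, Sergio Andreu-Sánchez, Filipe Garrett Vieira, Christina Bligaard Pedersen, Savvas Kinalis, Majbritt Busk Madsen, Miyako Kodama, Gül Sude Demircan, Arman Simonyan, Christina Westmose Yde, Lars Rønn Olsen, Rasmus L. Marvig, Olga Østrup, Maria Rossing, Finn Cilius Nielsen, Ole Winther, Frederik Otzen Bagger |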
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-12-01 |
Series: | Cancers |
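Subjects: | copy-number variation (CNV); whole genome sequencing (WGS); whole exome sequencing (WES); benchmark; bioinformatics; structural variant |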
Online Access: | https://www.mdpi.com/2072-6694/13/24/6283 |
_version_ | 1797506376411381760 |
---|---|
author | Migle Gabrielaite; Mathias Husted Torp; Malthe Sebro Rasmussen; Sergio Andreu-Sánchez; Filipe Garrett Vieira; Christina Bligaard Pedersen; Savvas Kinalis; Majbritt Busk Madsen; Miyako Kodama; Gül Sude Demircan; Arman Simonyan; Christina Westmose Yde; Lars Rønn Olsen; Rasmus L. Marvig; Olga Østrup; Maria Rossing; Finn Cilius Nielsen; Ole Winther; Frederik Otzen Bagger |
author_sort | Migle Gabrielaite |
collection | DOAJ |
description | Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. Relevant CNVs are hard to detect because common structural variations define large parts of the human genome. CNV calling from short-read sequencing would allow single-protocol full genomic profiling. We reviewed 50 popular CNV calling tools and included 11 tools for benchmarking in a reference cohort encompassing 39 whole genome sequencing (WGS) samples paired with the current clinical standard, SNP-array-based CNV calling. Additionally, for nine samples we also performed whole exome sequencing (WES) to address the effect of sequencing protocol on CNV calling. Furthermore, we included the gold-standard reference sample NA12878 and tested 12 samples with CNVs confirmed by multiplex ligation-dependent probe amplification (MLPA). Tool performance varied greatly in the number of called CNVs and in bias for CNV lengths. Some tools had near-perfect recall of CNVs from arrays for some samples, but poor precision. Several tools had better performance for NA12878, which could be a result of overfitting. We suggest combining the best-performing tools that are based on different methodologies: GATK gCNV, Lumpy, DELLY, and cn.MOPS. The total number of called variants could potentially be reduced by using background panels to filter out frequently called variants. |
first_indexed | 2024-03-10T04:31:48Z |
format | Article |
id | doaj.art-6b2c4b0a3a42411090a0afaa347ed145 |
institution | Directory of Open Access Journals |
issn | 2072-6694 |
language | English |
last_indexed | 2024-03-10T04:31:48Z |
publishDate | 2021-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Cancers |
spelling | doaj.art-6b2c4b0a3a42411090a0afaa347ed145; 2023-11-23T04:06:24Z; eng; MDPI AG; Cancers; ISSN 2072-6694; 2021-12-01; volume 13, issue 24, article 6283; DOI 10.3390/cancers13246283; affiliation given for the authors: Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, 2100 Copenhagen, Denmark |
title | A Comparison of Tools for Copy-Number Variation Detection in Germline Whole Exome and Whole Genome Sequencing Data |
topic | copy-number variation (CNV); whole genome sequencing (WGS); whole exome sequencing (WES); benchmark; bioinformatics; structural variant |
url | https://www.mdpi.com/2072-6694/13/24/6283 |