Improving high-resolution copy number variation analysis from next generation sequencing using unique molecular identifiers

Abstract. Background: Recently, copy number variations (CNVs) affecting genes involved in oncogenic pathways have attracted increasing attention for managing disease susceptibility. CNVs are among the most important somatic aberrations in the genome of tumor cells. Oncogene activation and tumor suppress...

Bibliographic Details
Main Authors: Pierre-Julien Viailly, Vincent Sater, Mathieu Viennot, Elodie Bohers, Nicolas Vergne, Caroline Berard, Hélène Dauchel, Thierry Lecroq, Alison Celebi, Philippe Ruminy, Vinciane Marchand, Marie-Delphine Lanic, Sydney Dubois, Dominique Penther, Hervé Tilly, Sylvain Mareschal, Fabrice Jardin
Format: Article
Language: English
Published: BMC 2021-03-01
Series: BMC Bioinformatics
Subjects: UMI; CNV calling; Next generation sequencing
Online Access: https://doi.org/10.1186/s12859-021-04060-4
collection DOAJ
description Abstract. Background: Recently, copy number variations (CNVs) affecting genes involved in oncogenic pathways have attracted increasing attention for managing disease susceptibility. CNVs are among the most important somatic aberrations in the genome of tumor cells. In many cancer types and stages, oncogene activation is often attributed to copy number gain or amplification, and tumor suppressor gene inactivation to deletion. Recent advances in next generation sequencing protocols allow unique molecular identifiers (UMIs) to be added to each read: each targeted DNA fragment is labeled with a unique random nucleotide sequence added to the sequencing primers. UMIs are especially useful for CNV detection because they make each DNA molecule in a population of reads distinct. Results: Here, we present molecular Copy Number Alteration (mCNA), a new methodology for detecting copy number changes using UMIs. The algorithm is composed of four main steps: (1) the construction of UMI count matrices, (2) the use of control samples to construct a pseudo-reference, (3) the computation of log-ratios, and (4) segmentation followed by statistical inference of abnormal segmented breaks. We demonstrate the success of mCNA on a dataset of patients with diffuse large B-cell lymphoma, and we highlight that mCNA results correlate strongly with comparative genomic hybridization. Conclusion: We provide mCNA, a new approach for CNV detection, freely available at https://gitlab.com/pierrejulien.viailly/mcna/ under the MIT license. By using UMIs, mCNA can significantly improve the accuracy of CNV detection.
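The first three steps named in the abstract (UMI counting, pseudo-reference construction, log-ratio computation) can be illustrated with a minimal Python sketch. This is not the mCNA implementation: the function names, the library-size normalization, the use of a per-region median across controls, and the small `EPS` pseudocount are all assumptions made for illustration. Segmentation and statistical inference are not covered.

```python
import math
from statistics import median

EPS = 1e-9  # assumed pseudocount to avoid log2(0); not taken from the source


def umi_counts(reads):
    """Count distinct UMIs per region from (region, umi) pairs.

    Reads sharing the same region and UMI are treated as copies of one
    original molecule and are counted once.
    """
    seen = {}
    for region, umi in reads:
        seen.setdefault(region, set()).add(umi)
    return {region: len(umis) for region, umis in seen.items()}


def normalize(counts):
    """Scale counts so they sum to 1 (assumed library-size normalization)."""
    total = sum(counts.values())
    return {region: c / total for region, c in counts.items()}


def pseudo_reference(control_counts):
    """Per-region median of normalized counts across control samples."""
    normed = [normalize(c) for c in control_counts]
    regions = normed[0].keys()
    return {r: median(n.get(r, 0.0) for n in normed) for r in regions}


def log_ratios(sample_counts, reference):
    """log2 ratio of normalized sample counts to the pseudo-reference."""
    normed = normalize(sample_counts)
    return {
        r: math.log2((normed.get(r, 0.0) + EPS) / (ref + EPS))
        for r, ref in reference.items()
    }
```

In this sketch a region with a positive log-ratio has more distinct molecules than the controls would predict (a candidate gain), and a negative value suggests a candidate loss. A real pipeline would then segment the log-ratios along the genome before calling any alteration.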
first_indexed 2024-12-16T15:10:53Z
id doaj.art-b229cd4e14c749a68bcf4b30c189122a
institution Directory of Open Access Journals
issn 1471-2105
last_indexed 2024-12-16T15:10:53Z
spelling doaj.art-b229cd4e14c749a68bcf4b30c189122a
2022-12-21T22:26:58Z
eng
BMC
BMC Bioinformatics
1471-2105
2021-03-01
volume 22, issue 1, pages 1-15
10.1186/s12859-021-04060-4
Pierre-Julien Viailly: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Vincent Sater: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Mathieu Viennot: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Elodie Bohers: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Nicolas Vergne: LMRS UMRS 6085, Normandie Univ, UNIROUEN
Caroline Berard: LITIS EA 4108, Normandie Univ, UNIROUEN
Hélène Dauchel: LITIS EA 4108, Normandie Univ, UNIROUEN
Thierry Lecroq: LITIS EA 4108, Normandie Univ, UNIROUEN
Alison Celebi: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Philippe Ruminy: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Vinciane Marchand: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Marie-Delphine Lanic: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Sydney Dubois: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Dominique Penther: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Hervé Tilly: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
Sylvain Mareschal: INSERM U1052 UMR CNRS 5286, Cancer Research Center of Lyon
Fabrice Jardin: INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN
topic UMI
CNV calling
Next generation sequencing