Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation
Abstract The short lengths of short-read sequencing reads challenge the analysis of paralogous genomic regions in exome and genome sequencing data. Most genetic variants within these homologous regions therefore remain unidentified in standard analyses. Here, we present a method (Chameleolyser) that...
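Main Authors: | Wouter Steyaert, Lonneke Haer-Wigman, Rolph Pfundt, Debby Hellebrekers, Marloes Steehouwer, Juliet Hampstead, Elke de Boer, Alexander Stegmann, Helger Yntema, Erik-Jan Kamsteeg, Han Brunner, Alexander Hoischen, Christian Gilissen |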
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2023-10-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-023-42531-9 |
_version_ | 1797647283752271872 |
---|---|
author | Wouter Steyaert Lonneke Haer-Wigman Rolph Pfundt Debby Hellebrekers Marloes Steehouwer Juliet Hampstead Elke de Boer Alexander Stegmann Helger Yntema Erik-Jan Kamsteeg Han Brunner Alexander Hoischen Christian Gilissen |
author_sort | Wouter Steyaert |
collection | DOAJ |
description | Abstract The short lengths of short-read sequencing reads challenge the analysis of paralogous genomic regions in exome and genome sequencing data. Most genetic variants within these homologous regions therefore remain unidentified in standard analyses. Here, we present a method (Chameleolyser) that accurately identifies single nucleotide variants and small insertions/deletions (SNVs/Indels), copy number variants and ectopic gene conversion events in duplicated genomic regions using whole-exome sequencing data. Application to a cohort of 41,755 exome samples yields 20,432 rare homozygous deletions and 2,529,791 rare SNVs/Indels, of which we show that 338,084 are due to gene conversion events. None of the SNVs/Indels are detectable using regular analysis techniques. Validation by high-fidelity long-read sequencing in 20 samples confirms >88% of called variants. Focusing on variation in known disease genes leads to a direct molecular diagnosis in 25 previously undiagnosed patients. Our method can readily be applied to existing exome data. |
first_indexed | 2024-03-11T15:14:04Z |
format | Article |
id | doaj.art-b6807c8412374d2aa25e893290d2b4c6 |
institution | Directory Open Access Journal |
issn | 2041-1723 |
language | English |
last_indexed | 2024-03-11T15:14:04Z |
publishDate | 2023-10-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj.art-b6807c8412374d2aa25e893290d2b4c6; 2023-10-29T12:29:37Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2023-10-01; 141113; 10.1038/s41467-023-42531-9; Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation; Authors: Wouter Steyaert [0], Lonneke Haer-Wigman [1], Rolph Pfundt [2], Debby Hellebrekers [3], Marloes Steehouwer [4], Juliet Hampstead [5], Elke de Boer [6], Alexander Stegmann [7], Helger Yntema [8], Erik-Jan Kamsteeg [9], Han Brunner [10], Alexander Hoischen [11], Christian Gilissen [12]; Affiliations (in author order): Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center [0-2, 4-6, 8-12]; Maastricht University Medical Center+, Department of Clinical Genetics [3, 7]; https://doi.org/10.1038/s41467-023-42531-9 |
title | Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation |
url | https://doi.org/10.1038/s41467-023-42531-9 |