Simulation of undiagnosed patients with novel genetic conditions

Abstract Rare Mendelian disorders pose a major diagnostic challenge and collectively affect 300–400 million patients worldwide. Many automated tools aim to uncover causal genes in patients with suspected genetic disorders, but evaluation of these tools is limited due to the lack of comprehensive ben...


Bibliographic Details
Main Authors: Emily Alsentzer, Samuel G. Finlayson, Michelle M. Li, Undiagnosed Diseases Network, Shilpa N. Kobren, Isaac S. Kohane
Format: Article
Language: English
Published: Nature Portfolio 2023-10-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-023-41980-6
author Emily Alsentzer
Samuel G. Finlayson
Michelle M. Li
Undiagnosed Diseases Network
Shilpa N. Kobren
Isaac S. Kohane
collection DOAJ
description Abstract Rare Mendelian disorders pose a major diagnostic challenge and collectively affect 300–400 million patients worldwide. Many automated tools aim to uncover causal genes in patients with suspected genetic disorders, but evaluation of these tools is limited due to the lack of comprehensive benchmark datasets that include previously unpublished conditions. Here, we present a computational pipeline that simulates realistic clinical datasets to address this deficit. Our framework jointly simulates complex phenotypes and challenging candidate genes and produces patients with novel genetic conditions. We demonstrate the similarity of our simulated patients to real patients from the Undiagnosed Diseases Network and evaluate common gene prioritization methods on the simulated cohort. These prioritization methods recover known gene-disease associations but perform poorly on diagnosing patients with novel genetic disorders. Our publicly-available dataset and codebase can be utilized by medical genetics researchers to evaluate, compare, and improve tools that aid in the diagnostic process.
first_indexed 2024-03-10T17:35:37Z
format Article
id doaj.art-83597d6332fa45a3a95b8e4a66d94735
institution Directory Open Access Journal
issn 2041-1723
language English
last_indexed 2024-03-10T17:35:37Z
publishDate 2023-10-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj.art-83597d6332fa45a3a95b8e4a66d94735 | 2023-11-20T09:52:43Z | eng | Nature Portfolio | Nature Communications | 2041-1723 | 2023-10-01 | vol. 14, no. 1, pp. 1–13 | 10.1038/s41467-023-41980-6 | Simulation of undiagnosed patients with novel genetic conditions | Emily Alsentzer (0), Samuel G. Finlayson (1), Michelle M. Li (2), Undiagnosed Diseases Network, Shilpa N. Kobren (3), Isaac S. Kohane (4) | Affiliations 0–4: Department of Biomedical Informatics, Harvard Medical School | https://doi.org/10.1038/s41467-023-41980-6
title Simulation of undiagnosed patients with novel genetic conditions
url https://doi.org/10.1038/s41467-023-41980-6