Simulation of undiagnosed patients with novel genetic conditions
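Abstract: Rare Mendelian disorders pose a major diagnostic challenge and collectively affect 300–400 million patients worldwide. Many automated tools aim to uncover causal genes in patients with suspected genetic disorders, but evaluation of these tools is limited due to the lack of comprehensive benchmark datasets that include previously unpublished conditions. Here, we present a computational pipeline that simulates realistic clinical datasets to address this deficit. Our framework jointly simulates complex phenotypes and challenging candidate genes and produces patients with novel genetic conditions. We demonstrate the similarity of our simulated patients to real patients from the Undiagnosed Diseases Network and evaluate common gene prioritization methods on the simulated cohort. These prioritization methods recover known gene-disease associations but perform poorly on diagnosing patients with novel genetic disorders. Our publicly available dataset and codebase can be utilized by medical genetics researchers to evaluate, compare, and improve tools that aid in the diagnostic process.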
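Main Authors: | Emily Alsentzer, Samuel G. Finlayson, Michelle M. Li, Undiagnosed Diseases Network, Shilpa N. Kobren, Isaac S. Kohane |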
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2023-10-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-023-41980-6 |
_version_ | 1797558730158505984 |
---|---|
author | Emily Alsentzer; Samuel G. Finlayson; Michelle M. Li; Undiagnosed Diseases Network; Shilpa N. Kobren; Isaac S. Kohane |
author_sort | Emily Alsentzer |
collection | DOAJ |
description | Rare Mendelian disorders pose a major diagnostic challenge and collectively affect 300–400 million patients worldwide. Many automated tools aim to uncover causal genes in patients with suspected genetic disorders, but evaluation of these tools is limited due to the lack of comprehensive benchmark datasets that include previously unpublished conditions. Here, we present a computational pipeline that simulates realistic clinical datasets to address this deficit. Our framework jointly simulates complex phenotypes and challenging candidate genes and produces patients with novel genetic conditions. We demonstrate the similarity of our simulated patients to real patients from the Undiagnosed Diseases Network and evaluate common gene prioritization methods on the simulated cohort. These prioritization methods recover known gene-disease associations but perform poorly on diagnosing patients with novel genetic disorders. Our publicly available dataset and codebase can be utilized by medical genetics researchers to evaluate, compare, and improve tools that aid in the diagnostic process. |
first_indexed | 2024-03-10T17:35:37Z |
format | Article |
id | doaj.art-83597d6332fa45a3a95b8e4a66d94735 |
institution | Directory of Open Access Journals |
issn | 2041-1723 |
language | English |
last_indexed | 2024-03-10T17:35:37Z |
publishDate | 2023-10-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj.art-83597d6332fa45a3a95b8e4a66d94735; 2023-11-20T09:52:43Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2023-10-01; volume 14, issue 1, pages 1–13; 10.1038/s41467-023-41980-6; Simulation of undiagnosed patients with novel genetic conditions; Emily Alsentzer, Samuel G. Finlayson, Michelle M. Li, Undiagnosed Diseases Network, Shilpa N. Kobren, Isaac S. Kohane; Department of Biomedical Informatics, Harvard Medical School (affiliation listed for each of the five named individual authors); https://doi.org/10.1038/s41467-023-41980-6 |
title | Simulation of undiagnosed patients with novel genetic conditions |
url | https://doi.org/10.1038/s41467-023-41980-6 |