Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records
Human genetic disorders, such as Down syndrome, have a wide variety of clinical phenotypic presentations, and characterizing each nuanced phenotype and subtype can be difficult. In this study, we examined the electronic health records of 4095 individuals with Down syndrome at the Children’s Hospital...
Main Authors: | James Margolin Havrilla, Mengge Zhao, Cong Liu, Chunhua Weng, Ingo Helbig, Elizabeth Bhoj, Kai Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-07-01 |
Series: | Genes |
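Subjects: | Down syndrome; phenotype; electronic health records; phenotypic spectrum; longitudinal study; natural language processing |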
Online Access: | https://www.mdpi.com/2073-4425/12/8/1159 |
_version_ | 1797523779308486656 |
---|---|
author | James Margolin Havrilla; Mengge Zhao; Cong Liu; Chunhua Weng; Ingo Helbig; Elizabeth Bhoj; Kai Wang |
author_sort | James Margolin Havrilla |
collection | DOAJ |
description | Human genetic disorders, such as Down syndrome, have a wide variety of clinical phenotypic presentations, and characterizing each nuanced phenotype and subtype can be difficult. In this study, we examined the electronic health records of 4095 individuals with Down syndrome at the Children’s Hospital of Philadelphia to create a method to characterize the phenotypic spectrum digitally. We extracted Human Phenotype Ontology (HPO) terms from quality-filtered patient notes using a natural language processing (NLP) approach, MetaMap. We catalogued the most common HPO terms related to Down syndrome patients and compared the terms with those from a baseline population. We characterized the top 100 HPO terms by their frequencies at different ages of clinical visits and highlighted selected terms that have time-dependent distributions. We also discovered phenotypic terms that have not been significantly associated with Down syndrome, such as “Proptosis”, “Downslanted palpebral fissures”, and “Microtia”. In summary, our study demonstrated that the clinical phenotypic spectrum of individuals with Mendelian diseases can be characterized through NLP-based digital phenotyping on population-scale electronic health records (EHRs). |
first_indexed | 2024-03-10T08:47:54Z |
format | Article |
id | doaj.art-8d4944dce4fd46be91a172c243975b24 |
institution | Directory Open Access Journal |
issn | 2073-4425 |
language | English |
last_indexed | 2024-03-10T08:47:54Z |
publishDate | 2021-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Genes |
spelling | doaj.art-8d4944dce4fd46be91a172c243975b24; 2023-11-22T07:45:21Z; eng; MDPI AG; Genes; 2073-4425; 2021-07-01; 12; 8; 1159; 10.3390/genes12081159; Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records; Authors: James Margolin Havrilla [0], Mengge Zhao [1], Cong Liu [2], Chunhua Weng [3], Ingo Helbig [4], Elizabeth Bhoj [5], Kai Wang [6]; Affiliations: [0] Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; [1] Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; [2] Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, USA; [3] Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, USA; [4] Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; [5] Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; [6] Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA |
title | Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records |
topic | Down syndrome; phenotype; electronic health records; phenotypic spectrum; longitudinal study; natural language processing |
url | https://www.mdpi.com/2073-4425/12/8/1159 |
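
The description above mentions cataloguing the most common HPO terms in the Down syndrome cohort and comparing them with a baseline population. A minimal sketch of that kind of comparison follows. It assumes the terms have already been extracted from the notes (for example by MetaMap) into one list per patient. The function names, the toy patients, and the use of a simple frequency difference are illustrative assumptions, not the authors' published method or statistic.

```python
from collections import Counter

def term_frequencies(patient_terms):
    """Return the fraction of patients mentioning each term at least once.

    patient_terms maps a patient ID to the list of HPO term names
    extracted from that patient's notes (hypothetical input format).
    """
    counts = Counter()
    for terms in patient_terms.values():
        counts.update(set(terms))  # count each term once per patient
    n_patients = len(patient_terms)
    return {term: c / n_patients for term, c in counts.items()}

def compare_to_baseline(cohort, baseline):
    """Rank terms by (cohort frequency - baseline frequency), largest first."""
    cohort_freq = term_frequencies(cohort)
    baseline_freq = term_frequencies(baseline)
    diffs = [
        (term, freq - baseline_freq.get(term, 0.0))
        for term, freq in cohort_freq.items()
    ]
    return sorted(diffs, key=lambda pair: pair[1], reverse=True)

# Toy data, invented for illustration only.
cohort = {
    "p1": ["Hypotonia", "Hearing impairment"],
    "p2": ["Hypotonia"],
}
baseline = {
    "b1": ["Hearing impairment"],
    "b2": [],
    "b3": ["Hypotonia"],
    "b4": [],
}

ranked = compare_to_baseline(cohort, baseline)
for term, diff in ranked:
    print(f"{term}: {diff:+.2f}")
```

Counting each term once per patient keeps a single heavily documented patient from dominating the ranking. A real analysis would likely replace the raw difference with a statistical test and stratify visits by age, as the description indicates.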