Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records

Human genetic disorders, such as Down syndrome, have a wide variety of clinical phenotypic presentations, and characterizing each nuanced phenotype and subtype can be difficult. In this study, we examined the electronic health records of 4095 individuals with Down syndrome at the Children’s Hospital...

Bibliographic Details
Main Authors: James Margolin Havrilla, Mengge Zhao, Cong Liu, Chunhua Weng, Ingo Helbig, Elizabeth Bhoj, Kai Wang
Format: Article
Language: English
Published: MDPI AG 2021-07-01
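Series: Genes
Subjects: Down syndrome; phenotype; electronic health records; phenotypic spectrum; longitudinal study; natural language processing
Online Access: https://www.mdpi.com/2073-4425/12/8/1159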
collection DOAJ
description Human genetic disorders, such as Down syndrome, have a wide variety of clinical phenotypic presentations, and characterizing each nuanced phenotype and subtype can be difficult. In this study, we examined the electronic health records of 4095 individuals with Down syndrome at the Children’s Hospital of Philadelphia to create a method to characterize the phenotypic spectrum digitally. We extracted Human Phenotype Ontology (HPO) terms from quality-filtered patient notes using a natural language processing (NLP) approach, MetaMap. We catalogued the most common HPO terms related to Down syndrome patients and compared the terms with those from a baseline population. We characterized the top 100 HPO terms by their frequencies at different ages of clinical visits and highlighted selected terms that have time-dependent distributions. We also discovered phenotypic terms that have not been significantly associated with Down syndrome, such as “Proptosis”, “Downslanted palpebral fissures”, and “Microtia”. In summary, our study demonstrated that the clinical phenotypic spectrum of individuals with Mendelian diseases can be characterized through NLP-based digital phenotyping on population-scale electronic health records (EHRs).
first_indexed 2024-03-10T08:47:54Z
format Article
id doaj.art-8d4944dce4fd46be91a172c243975b24
institution Directory Open Access Journal
issn 2073-4425
language English
last_indexed 2024-03-10T08:47:54Z
publishDate 2021-07-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-8d4944dce4fd46be91a172c243975b24
2023-11-22T07:45:21Z
eng
MDPI AG
Genes
2073-4425
2021-07-01
Volume 12, Issue 8, Article 1159
DOI 10.3390/genes12081159
James Margolin Havrilla: Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
Mengge Zhao: Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
Cong Liu: Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, USA
Chunhua Weng: Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, USA
Ingo Helbig: Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
Elizabeth Bhoj: Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
Kai Wang: Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
title Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records
topic Down syndrome
phenotype
electronic health records
phenotypic spectrum
longitudinal study
natural language processing
url https://www.mdpi.com/2073-4425/12/8/1159