A new method to measure the semantic similarity from query phenotypic abnormalities to diseases based on the human phenotype ontology

Abstract. Background: Although rapidly developing sequencing technologies make it possible to use genotype data in clinical diagnosis, it remains challenging for clinicians to interpret sequencing results and make correct judgements based on them. Previously, diagnosis based on clinical...

Bibliographic Details
Main Authors: Xiaofeng Gong, Jianping Jiang, Zhongqu Duan, Hui Lu
Format: Article
Language: English
Published: BMC 2018-05-01
Series: BMC Bioinformatics
Subjects: Human phenotype ontology (HPO); Semantic similarity; Disease; Diagnosis
Online Access: http://link.springer.com/article/10.1186/s12859-018-2064-y
author Xiaofeng Gong
Jianping Jiang
Zhongqu Duan
Hui Lu
collection DOAJ
description Abstract. Background: Although rapidly developing sequencing technologies make it possible to use genotype data in clinical diagnosis, it remains challenging for clinicians to interpret sequencing results and make correct judgements based on them. Previously, diagnosis based on clinical features held a leading position. With the establishment of the Human Phenotype Ontology (HPO) and the enrichment of phenotype-disease annotations, improving phenotype-based diagnosis has attracted much more attention. Results: In this study, we present a novel method called RelativeBestPair that measures similarity from the query terms to hereditary diseases based on HPO and then ranks the candidate diseases. To evaluate its performance, we simulated a set of patients based on 44 complex diseases. In addition, by adding noise, imprecision, or both, we generated cases closer to real clinical conditions. Four simulated datasets were thus used to compare RelativeBestPair with seven existing semantic similarity measures. RelativeBestPair ranked the underlying disease first in 93.73% of cases in the simulated dataset without noise or imprecision, 93.64% in the dataset with noise but without imprecision, 39.82% in the dataset with imprecision but without noise, and 33.64% in the dataset with both noise and imprecision. Conclusion: Compared with the seven existing semantic similarity measures, RelativeBestPair showed similar performance on the two datasets without imprecision. RelativeBestPair performed on par with Resnik and better than the other six methods on the dataset with imprecision but without noise, and it significantly outperformed all seven other methods on the dataset with both noise and imprecision. These results suggest that RelativeBestPair might be of great help in clinical settings.
first_indexed 2024-12-21T06:09:37Z
format Article
id doaj.art-cd83fb6f4e984ef882c88e345c5280ce
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-21T06:09:37Z
publishDate 2018-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
title A new method to measure the semantic similarity from query phenotypic abnormalities to diseases based on the human phenotype ontology
topic Human phenotype ontology (HPO)
Semantic similarity
Disease
Diagnosis
url http://link.springer.com/article/10.1186/s12859-018-2064-y