Multi-domain semantic similarity in biomedical research

Bibliographic Details
Main Authors: João D. Ferreira, Francisco M. Couto
Format: Article
Language: English
Published: BMC 2019-05-01
Series: BMC Bioinformatics
Subjects: Semantic similarity; Epidemiology; Multiple-domain; Prediction
Online Access: http://link.springer.com/article/10.1186/s12859-019-2810-9
author João D. Ferreira
Francisco M. Couto
author_sort João D. Ferreira
collection DOAJ
description Abstract. Background: Given the increasing amount of biomedical resources that are being annotated with concepts from more than one ontology and covering multiple domains of knowledge, it is important to devise mechanisms to compare these resources that take into account the various domains of annotation. For example, metabolic pathways are annotated with their enzymes and their metabolites, and thus similarity measures should compare them with respect to both of those domains simultaneously. Results: In this paper, we propose two approaches to lift existing single-ontology semantic similarity measures into multi-domain measures. The aggregative approach compares domains independently and averages the various similarity values into a final score. The integrative approach integrates all the relevant ontologies into a single one, calculating similarity in the resulting multi-domain ontology using the single-ontology measure. Conclusions: We evaluated the two approaches in a multidisciplinary epidemiology dataset by assessing the capacity of the similarity measures to predict new annotations based on the existing ones. The results show a promising increase in performance of the multi-domain measures over the single-ontology ones in the vast majority of cases. These results show that multi-domain measures outperform single-domain ones, and should be considered by the community as a starting point to study more efficient multi-domain semantic similarity measures.
first_indexed 2024-12-10T21:21:37Z
format Article
id doaj.art-e4f5331a245040679b79e2d345959aab
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-10T21:21:37Z
publishDate 2019-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-e4f5331a245040679b79e2d345959aab
2022-12-22T01:33:06Z
eng
BMC
BMC Bioinformatics
1471-2105
2019-05-01
20
S10
2331
10.1186/s12859-019-2810-9
Multi-domain semantic similarity in biomedical research
João D. Ferreira 0
Francisco M. Couto 1
LASIGE, Faculdade de Ciências, Universidade de Lisboa
LASIGE, Faculdade de Ciências, Universidade de Lisboa
http://link.springer.com/article/10.1186/s12859-019-2810-9
Semantic similarity
Epidemiology
Multiple-domain
Prediction
title Multi-domain semantic similarity in biomedical research
title_sort multi domain semantic similarity in biomedical research
topic Semantic similarity
Epidemiology
Multiple-domain
Prediction
url http://link.springer.com/article/10.1186/s12859-019-2810-9
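
Illustrative note: the abstract in this record describes an aggregative approach (compare each domain separately, then average) and an integrative approach (merge the ontologies, then apply the single-ontology measure once). The Python sketch below shows one way those two ideas could look on toy data. It is not the authors' implementation. The ontologies, concept names, and function names are all hypothetical, and the single-ontology measure used here (Jaccard overlap of ancestor closures) is a stand-in chosen for brevity, not necessarily a measure used in the paper. The integrative function also assumes concept names are unique across ontologies, so merging is a plain union.

```python
from statistics import mean

# Toy ontologies as child -> set of parents (all names are hypothetical).
ONTOLOGIES = {
    "enzymes": {"enzyme": set(), "kinase": {"enzyme"},
                "hexokinase": {"kinase"}, "glucokinase": {"kinase"}},
    "chemicals": {"chemical": set(), "sugar": {"chemical"},
                  "glucose": {"sugar"}, "fructose": {"sugar"}},
}

def closure(ontology, annotations):
    """Return the annotations together with all of their ancestors."""
    seen = set()
    stack = list(annotations)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(ontology[node])
    return seen

def single_ontology_sim(ontology, annots_a, annots_b):
    """Stand-in single-ontology measure: Jaccard overlap of ancestor closures."""
    a, b = closure(ontology, annots_a), closure(ontology, annots_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def aggregative_sim(ontologies, entity_a, entity_b):
    """Compare each domain independently, then average the per-domain scores."""
    return mean(
        single_ontology_sim(onto, entity_a.get(d, set()), entity_b.get(d, set()))
        for d, onto in ontologies.items()
    )

def integrative_sim(ontologies, entity_a, entity_b):
    """Merge all ontologies into one graph and apply the measure once."""
    merged = {}
    for onto in ontologies.values():
        merged.update(onto)  # assumption: no concept name appears in two ontologies
    all_a = set().union(*entity_a.values())
    all_b = set().union(*entity_b.values())
    return single_ontology_sim(merged, all_a, all_b)

# Two hypothetical pathways, each annotated in both domains.
pathway_a = {"enzymes": {"kinase"}, "chemicals": {"glucose"}}
pathway_b = {"enzymes": {"glucokinase"}, "chemicals": {"glucose", "fructose"}}

print(aggregative_sim(ONTOLOGIES, pathway_a, pathway_b))
print(integrative_sim(ONTOLOGIES, pathway_a, pathway_b))
```

On this toy input the two scores differ slightly, because averaging weights each domain equally while the merged ontology weights each concept equally.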