Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
Conceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C represents an advancement in the computation of sema...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2023-11-01
|
Series: | Mathematics |
Subjects: | |
Online Access: | https://www.mdpi.com/2227-7390/11/23/4806 |
_version_ | 1797399825498505216 |
---|---|
author | Rolando Quintero Esteban Mendiola Giovanni Guzmán Miguel Torres-Ruiz Carlos Guzmán Sánchez-Mejorada |
author_facet | Rolando Quintero Esteban Mendiola Giovanni Guzmán Miguel Torres-Ruiz Carlos Guzmán Sánchez-Mejorada |
author_sort | Rolando Quintero |
collection | DOAJ |
description | Conceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C represents an advancement in the computation of semantic similarity/relationships that is independent of the type of knowledge structure and semantic relations when generating a graph from a knowledge base (ontologies, semantic networks, and hierarchies, among others). This approach determines the semantic similarity between two indirectly connected concepts in an ontology by propagating local distances by applying an algorithm based on the All Pairs Shortest Path (APSP) problem. This process is implemented for each pair of concepts to establish the most effective and efficient paths to connect these concepts. The algorithm identifies the shortest path between concepts, which allows for an inference of the most relevant relationships between them. However, one of the critical issues with this process is computational complexity, combined with the design of APSP algorithms, such as Dijkstra, which is <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mo>𝒪</mo><mfenced open="(" close=")"><msup><mi>n</mi><mn>3</mn></msup></mfenced></mrow></semantics></math></inline-formula>. This paper studies different alternatives to improve the DIS-C approach by adapting approximation algorithms, focusing on Dijkstra, pruned Dijkstra, and sketch-based methods, to compute the conceptual distance according to the need to scale DIS-C to analyze very large graphs; therefore, reducing the related computational complexity is critical. Tests were performed using different datasets to calculate the conceptual distance when using the original version of DIS-C and when using the influence area of nodes. In situations where time optimization is necessary for generating results, using the original DIS-C model is not the optimal method. Therefore, we propose a simplified version of DIS-C to calculate conceptual distances based on centrality estimation. The obtained results for the simple version of DIS-C indicated that the processing time decreased 2.381 times when compared to the original DIS-C version. Additionally, for both versions of DIS-C (normal and simple), the APSP algorithm decreased the computational cost when using a two-hop coverage-based approach. |
first_indexed | 2024-03-09T01:46:41Z |
format | Article |
id | doaj.art-2859137192a748079567b1ad196913fd |
institution | Directory Open Access Journal |
issn | 2227-7390 |
language | English |
last_indexed | 2024-03-09T01:46:41Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj.art-2859137192a748079567b1ad196913fd2023-12-08T15:21:49ZengMDPI AGMathematics2227-73902023-11-011123480610.3390/math11234806Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge GraphsRolando Quintero0Esteban Mendiola1Giovanni Guzmán2Miguel Torres-Ruiz3Carlos Guzmán Sánchez-Mejorada4Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoConceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C represents an advancement in the computation of semantic similarity/relationships that is independent of the type of knowledge structure and semantic relations when generating a graph from a knowledge base (ontologies, semantic networks, and hierarchies, among others). This approach determines the semantic similarity between two indirectly connected concepts in an ontology by propagating local distances by applying an algorithm based on the All Pairs Shortest Path (APSP) problem. This process is implemented for each pair of concepts to establish the most effective and efficient paths to connect these concepts. The algorithm identifies the shortest path between concepts, which allows for an inference of the most relevant relationships between them. However, one of the critical issues with this process is computational complexity, combined with the design of APSP algorithms, such as Dijkstra, which is <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mo>𝒪</mo><mfenced open="(" close=")"><msup><mi>n</mi><mn>3</mn></msup></mfenced></mrow></semantics></math></inline-formula>. This paper studies different alternatives to improve the DIS-C approach by adapting approximation algorithms, focusing on Dijkstra, pruned Dijkstra, and sketch-based methods, to compute the conceptual distance according to the need to scale DIS-C to analyze very large graphs; therefore, reducing the related computational complexity is critical. Tests were performed using different datasets to calculate the conceptual distance when using the original version of DIS-C and when using the influence area of nodes. In situations where time optimization is necessary for generating results, using the original DIS-C model is not the optimal method. Therefore, we propose a simplified version of DIS-C to calculate conceptual distances based on centrality estimation. The obtained results for the simple version of DIS-C indicated that the processing time decreased 2.381 times when compared to the original DIS-C version. Additionally, for both versions of DIS-C (normal and simple), the APSP algorithm decreased the computational cost when using a two-hop coverage-based approach.https://www.mdpi.com/2227-7390/11/23/4806conceptual distanceshortest path algorithmsaccelerated calculationcomputational complexity |
spellingShingle | Rolando Quintero Esteban Mendiola Giovanni Guzmán Miguel Torres-Ruiz Carlos Guzmán Sánchez-Mejorada Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs Mathematics conceptual distance shortest path algorithms accelerated calculation computational complexity |
title | Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs |
title_full | Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs |
title_fullStr | Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs |
title_full_unstemmed | Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs |
title_short | Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs |
title_sort | algorithm for the accelerated calculation of conceptual distances in large knowledge graphs |
topic | conceptual distance shortest path algorithms accelerated calculation computational complexity |
url | https://www.mdpi.com/2227-7390/11/23/4806 |
work_keys_str_mv | AT rolandoquintero algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs AT estebanmendiola algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs AT giovanniguzman algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs AT migueltorresruiz algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs AT carlosguzmansanchezmejorada algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs |