Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs

Conceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C represents an advancement in the computation of sema...

Full description

Bibliographic Details
Main Authors: Rolando Quintero, Esteban Mendiola, Giovanni Guzmán, Miguel Torres-Ruiz, Carlos Guzmán Sánchez-Mejorada
Format: Article
Language:English
Published: MDPI AG 2023-11-01
Series:Mathematics
Subjects:
Online Access:https://www.mdpi.com/2227-7390/11/23/4806
_version_ 1797399825498505216
author Rolando Quintero
Esteban Mendiola
Giovanni Guzmán
Miguel Torres-Ruiz
Carlos Guzmán Sánchez-Mejorada
author_facet Rolando Quintero
Esteban Mendiola
Giovanni Guzmán
Miguel Torres-Ruiz
Carlos Guzmán Sánchez-Mejorada
author_sort Rolando Quintero
collection DOAJ
description Conceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C represents an advancement in the computation of semantic similarity/relationships that is independent of the type of knowledge structure and semantic relations when generating a graph from a knowledge base (ontologies, semantic networks, and hierarchies, among others). This approach determines the semantic similarity between two indirectly connected concepts in an ontology by propagating local distances by applying an algorithm based on the All Pairs Shortest Path (APSP) problem. This process is implemented for each pair of concepts to establish the most effective and efficient paths to connect these concepts. The algorithm identifies the shortest path between concepts, which allows for an inference of the most relevant relationships between them. However, one of the critical issues with this process is computational complexity, combined with the design of APSP algorithms, such as Dijkstra, which is <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mo>𝒪</mo><mfenced open="(" close=")"><msup><mi>n</mi><mn>3</mn></msup></mfenced></mrow></semantics></math></inline-formula>. This paper studies different alternatives to improve the DIS-C approach by adapting approximation algorithms, focusing on Dijkstra, pruned Dijkstra, and sketch-based methods, to compute the conceptual distance according to the need to scale DIS-C to analyze very large graphs; therefore, reducing the related computational complexity is critical. Tests were performed using different datasets to calculate the conceptual distance when using the original version of DIS-C and when using the influence area of nodes. In situations where time optimization is necessary for generating results, using the original DIS-C model is not the optimal method. Therefore, we propose a simplified version of DIS-C to calculate conceptual distances based on centrality estimation. The obtained results for the simple version of DIS-C indicated that the processing time decreased 2.381 times when compared to the original DIS-C version. Additionally, for both versions of DIS-C (normal and simple), the APSP algorithm decreased the computational cost when using a two-hop coverage-based approach.
first_indexed 2024-03-09T01:46:41Z
format Article
id doaj.art-2859137192a748079567b1ad196913fd
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-09T01:46:41Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-2859137192a748079567b1ad196913fd2023-12-08T15:21:49ZengMDPI AGMathematics2227-73902023-11-011123480610.3390/math11234806Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge GraphsRolando Quintero0Esteban Mendiola1Giovanni Guzmán2Miguel Torres-Ruiz3Carlos Guzmán Sánchez-Mejorada4Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoCentro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Unidad Profesional Adolfo López Mateos (UPALM)-Zacatenco, Mexico City 07320, MexicoConceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C represents an advancement in the computation of semantic similarity/relationships that is independent of the type of knowledge structure and semantic relations when generating a graph from a knowledge base (ontologies, semantic networks, and hierarchies, among others). This approach determines the semantic similarity between two indirectly connected concepts in an ontology by propagating local distances by applying an algorithm based on the All Pairs Shortest Path (APSP) problem. This process is implemented for each pair of concepts to establish the most effective and efficient paths to connect these concepts. The algorithm identifies the shortest path between concepts, which allows for an inference of the most relevant relationships between them. However, one of the critical issues with this process is computational complexity, combined with the design of APSP algorithms, such as Dijkstra, which is <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mo>𝒪</mo><mfenced open="(" close=")"><msup><mi>n</mi><mn>3</mn></msup></mfenced></mrow></semantics></math></inline-formula>. This paper studies different alternatives to improve the DIS-C approach by adapting approximation algorithms, focusing on Dijkstra, pruned Dijkstra, and sketch-based methods, to compute the conceptual distance according to the need to scale DIS-C to analyze very large graphs; therefore, reducing the related computational complexity is critical. Tests were performed using different datasets to calculate the conceptual distance when using the original version of DIS-C and when using the influence area of nodes. In situations where time optimization is necessary for generating results, using the original DIS-C model is not the optimal method. Therefore, we propose a simplified version of DIS-C to calculate conceptual distances based on centrality estimation. The obtained results for the simple version of DIS-C indicated that the processing time decreased 2.381 times when compared to the original DIS-C version. Additionally, for both versions of DIS-C (normal and simple), the APSP algorithm decreased the computational cost when using a two-hop coverage-based approach.https://www.mdpi.com/2227-7390/11/23/4806conceptual distanceshortest path algorithmsaccelerated calculationcomputational complexity
spellingShingle Rolando Quintero
Esteban Mendiola
Giovanni Guzmán
Miguel Torres-Ruiz
Carlos Guzmán Sánchez-Mejorada
Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
Mathematics
conceptual distance
shortest path algorithms
accelerated calculation
computational complexity
title Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
title_full Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
title_fullStr Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
title_full_unstemmed Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
title_short Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs
title_sort algorithm for the accelerated calculation of conceptual distances in large knowledge graphs
topic conceptual distance
shortest path algorithms
accelerated calculation
computational complexity
url https://www.mdpi.com/2227-7390/11/23/4806
work_keys_str_mv AT rolandoquintero algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs
AT estebanmendiola algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs
AT giovanniguzman algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs
AT migueltorresruiz algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs
AT carlosguzmansanchezmejorada algorithmfortheacceleratedcalculationofconceptualdistancesinlargeknowledgegraphs