Semankey: A Semantics-Driven Approach for Querying RDF Repositories Using Keywords

Bibliographic Details
Main Authors: Francisco Abad-Navarro, Catalina Martinez-Costa, Jesualdo Tomas Fernandez-Breis
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9462071/
Description
Summary: The Web of Data aims at linking Internet data repositories. Semantic Web technologies make data easily readable by computer agents, enabling the automation of complex tasks and facilitating data integration. They support the realization of a Web of Data in which users can query the connected datasets in search-engine style, i.e., by using keywords. However, querying semantic repositories in a user-friendly way, without requiring mastery of query languages such as SPARQL, is still a challenging task. In this work, we present Semankey, an approach for automatically building SPARQL queries from a list of keywords entered by the user. Semankey identifies semantic entities in the keywords by using a domain ontology to interpret the meaning of the query, and then automatically builds a set of queries by connecting those entities through the relationships described in the ontology and by applying heuristics based on query size. The main contributions of Semankey are the use of query filters and the generation of multiple SPARQL queries derived from the different interpretations of the given input, according to the underlying domain ontology. We used the data from the Question Answering over Linked Data challenge to evaluate our approach in different execution modes and to analyze the query trees generated, obtaining a precision of 0.52 and a recall of 0.60 when considering the best answer provided per test case.
ISSN: 2169-3536
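
The keyword-to-SPARQL idea outlined in the summary can be illustrated with a minimal Python sketch. This is not the Semankey implementation: the toy ontology (`CLASS_LABELS`, `PROPERTIES`), the `ex:` namespace, and the functions `match_entities` and `build_queries` are all hypothetical names chosen for illustration. The sketch assumes that keywords matching a class label become typed variables, that matched classes are joined only through direct properties, and that each remaining keyword is tried as a label filter on every matched class, which yields one candidate query per interpretation. Sorting candidates by size is a simple stand-in for the size-based heuristics the summary mentions.

```python
from itertools import combinations, product

# Hypothetical toy ontology: keyword labels -> classes, plus
# (domain, property, range) triples describing relationships.
CLASS_LABELS = {"film": "ex:Film", "director": "ex:Director", "actor": "ex:Actor"}
PROPERTIES = [
    ("ex:Film", "ex:directedBy", "ex:Director"),
    ("ex:Film", "ex:starring", "ex:Actor"),
]
PREFIXES = (
    "PREFIX ex: <http://example.org/>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
)


def match_entities(keywords):
    """Split keywords into matched ontology classes and leftover literals."""
    classes, literals = [], []
    for kw in keywords:
        iri = CLASS_LABELS.get(kw.lower())
        if iri:
            classes.append(iri)
        else:
            literals.append(kw)
    return classes, literals


def var(iri):
    """Derive a SPARQL variable name from a prefixed class name."""
    return "?" + iri.split(":")[1].lower()


def build_queries(keywords):
    """Return candidate SPARQL queries, smallest first."""
    classes, literals = match_entities(keywords)
    if not classes:
        return []
    # One type pattern per matched class.
    body = [f"{var(c)} a {c} ." for c in classes]
    # Connect each pair of matched classes through direct properties.
    for a, b in combinations(classes, 2):
        for dom, prop, rng in PROPERTIES:
            if {dom, rng} == {a, b}:
                body.append(f"{var(dom)} {prop} {var(rng)} .")
    queries = []
    # Each leftover keyword may describe any matched class: one query
    # per assignment, i.e. per interpretation of the input.
    for assignment in product(classes, repeat=len(literals)):
        lines = list(body)
        for i, (lit, cls) in enumerate(zip(literals, assignment)):
            lines.append(f"{var(cls)} rdfs:label ?label{i} .")
            # Note: the literal is not escaped; acceptable for a sketch only.
            lines.append(
                f'FILTER(CONTAINS(LCASE(STR(?label{i})), "{lit.lower()}"))'
            )
        select = " ".join(var(c) for c in classes)
        queries.append(
            PREFIXES
            + f"SELECT DISTINCT {select} WHERE {{\n  "
            + "\n  ".join(lines)
            + "\n}"
        )
    # Stand-in for a size-based heuristic: fewer lines first.
    return sorted(queries, key=lambda q: (q.count("\n"), q))


if __name__ == "__main__":
    for query in build_queries(["film", "director", "Nolan"]):
        print(query, end="\n\n")
```

For the keywords `film`, `director`, and `Nolan`, this sketch produces two candidate queries: one in which the filter applies to the film's label and one in which it applies to the director's label. A real system would also need to handle indirect paths between entities, property and instance matches, and ranking of the resulting queries.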