Efficient SPARQL Queries Generator for Question Answering Systems

Bibliographic Details
Main Authors: Yi-Hui Chen, Eric Jui-Lin Lu, Ying-Yen Lin
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9893129/
Description: Much like traditional database querying, the question answering process in a Question Answering (QA) system involves converting a user’s question input into query grammar, querying the knowledge base through the query grammar, and finally returning the query result (i.e., the answer) to the user. The accuracy of query grammar generation is therefore important in determining whether a Question Answering system can produce a correct answer. Generally speaking, incorrect query grammar will never find the right answer. SPARQL is the most frequently used query language in question answering systems. In the past, SPARQL was generated based on graph structures, such as dependency trees and syntax trees. However, the query cost of generating SPARQL in this way is high, which leads to long processing times when answering questions. To reduce the query cost, this work proposes a low-cost SPARQL generator named Light-QAWizard, which integrates multi-label classification into a recurrent neural network (RNN), builds a template classifier, and generates the corresponding query grammars based on the results of the template classifier. Light-QAWizard reduces query frequency to DBpedia by aggregating multiple outputs into a single output using multi-label classification. In the experiments, Light-QAWizard’s performance on the Precision, Recall and F-measure metrics was evaluated on the QALD-7, QALD-8 and QALD-9 datasets. Not only did Light-QAWizard outperform all other models, but it also had a lower query cost that was nearly half that of QAWizard.
ISSN: 2169-3536
Volume: 10
Pages: 99850-99860
DOI: 10.1109/ACCESS.2022.3206794
Author Details:
Yi-Hui Chen (https://orcid.org/0000-0002-9932-0594), Department of Information Management, Chang Gung University, Taoyuan, Taiwan
Eric Jui-Lin Lu (https://orcid.org/0000-0001-7953-5486), Department of Management Information Systems, National Chung Hsing University, Taichung, Taiwan
Ying-Yen Lin, Department of Management Information Systems, National Chung Hsing University, Taichung, Taiwan
Subjects: Question answering system (QA); SPARQL query; query cost; recurrent neural network (RNN); question answering over linked data (QALD)
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-cf66426cacbc4505925b919358a34346
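
Illustrative sketch (not taken from the article): the description says that a template classifier with multi-label output drives SPARQL generation, and that several outputs are aggregated into one. The Python below shows one possible reading of that idea, under stated assumptions. The label names, the 0.5 threshold, the hard-coded scores standing in for the RNN output, and the use of UNION to merge the selected templates are all hypothetical choices made for this example; they are not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical template labels; the article's actual label set is not given here.
LABELS = ["subject_known", "object_known", "count"]


def select_labels(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Multi-label decision: keep every label whose score reaches the threshold."""
    return [label for label in LABELS if scores.get(label, 0.0) >= threshold]


def build_query(labels: List[str], entity: str, prop: str) -> str:
    """Assemble ONE SPARQL query from all selected templates (assumed UNION merge)."""
    patterns = []
    if "subject_known" in labels:
        # The entity is the subject of the triple; the answer is the object.
        patterns.append(f"{{ <{entity}> <{prop}> ?answer . }}")
    if "object_known" in labels:
        # The entity is the object of the triple; the answer is the subject.
        patterns.append(f"{{ ?answer <{prop}> <{entity}> . }}")
    if not patterns:
        raise ValueError("no triple-pattern template selected")
    body = " UNION ".join(patterns)
    if "count" in labels:
        head = "SELECT (COUNT(DISTINCT ?answer) AS ?n)"
    else:
        head = "SELECT DISTINCT ?answer"
    return f"{head} WHERE {{ {body} }}"


if __name__ == "__main__":
    # Scores stand in for the classifier output; the values are made up.
    scores = {"subject_known": 0.91, "object_known": 0.62, "count": 0.08}
    query = build_query(
        select_labels(scores),
        entity="http://dbpedia.org/resource/Taiwan",
        prop="http://dbpedia.org/ontology/capital",
    )
    print(query)
```

In this sketch, two labels pass the threshold, yet only one query string is produced, so a single request would be sent to the endpoint instead of one request per candidate template. Whether Light-QAWizard merges its outputs in this way is not stated in the record.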