The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

Abstract: Knowledge graphs have gained increasing popularity in science and technology over the last decade. However, knowledge graphs are currently relatively simple to moderately complex semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far been geared mainly towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. We then devised eight question templates with which we automatically generated a further 2465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types, and each is translated into a corresponding SPARQL query over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. The benchmark is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
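
To make the question-to-SPARQL pairing described in the abstract concrete, the following minimal Python sketch shows how one such benchmark item could be evaluated against the ORKG. The endpoint URL, the example question, and the example query are illustrative assumptions, not material from the paper; the released benchmark ships its own question-query pairs.

# Minimal sketch: evaluate a SciQA-style question/SPARQL pair against the ORKG.
# The endpoint URL, question, and query below are illustrative assumptions;
# consult the ORKG documentation for the current SPARQL endpoint.
import requests

# Assumed public SPARQL endpoint of the Open Research Knowledge Graph.
ORKG_SPARQL_ENDPOINT = "https://orkg.org/triplestore"

# Hypothetical benchmark item: every SciQA question is translated into a
# corresponding SPARQL query over the ORKG.
question = "How many papers are described in the ORKG?"
sparql_query = """
SELECT (COUNT(DISTINCT ?paper) AS ?n)
WHERE { ?paper a <http://orkg.org/orkg/class/Paper> . }
"""

def run_query(query: str) -> dict:
    # Standard SPARQL 1.1 protocol: GET with a `query` parameter and a
    # JSON results Accept header.
    response = requests.get(
        ORKG_SPARQL_ENDPOINT,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    results = run_query(sparql_query)
    for binding in results["results"]["bindings"]:
        print(f"{question} -> {binding['n']['value']}")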

Bibliographic Details
Main Authors: Sören Auer, Dante A. C. Barone, Cassiano Bartz, Eduardo G. Cortes, Mohamad Yaser Jaradeh, Oliver Karras, Manolis Koubarakis, Dmitry Mouromtsev, Dmitrii Pliukhin, Daniil Radyush, Ivan Shilin, Markus Stocker, Eleni Tsalapati
Format: Article
Language: English
Published: Nature Portfolio, 2023-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-33607-z
Collection: DOAJ
Record ID: doaj.art-1d163947d46245b68ce37cac348aab58
Institution: Directory of Open Access Journals
ISSN: 2045-2322
Citation: Scientific Reports, Vol. 13, Iss. 1, pp. 1-16 (2023-05-01), Nature Portfolio, doi:10.1038/s41598-023-33607-z
Author Affiliations:
Sören Auer: TIB—Leibniz Information Centre for Science and Technology
Dante A. C. Barone: Institute of Informatics, Federal University of Rio Grande do Sul
Cassiano Bartz: Institute of Informatics, Federal University of Rio Grande do Sul
Eduardo G. Cortes: Institute of Informatics, Federal University of Rio Grande do Sul
Mohamad Yaser Jaradeh: TIB—Leibniz Information Centre for Science and Technology
Oliver Karras: TIB—Leibniz Information Centre for Science and Technology
Manolis Koubarakis: Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
Dmitry Mouromtsev: Laboratory of Information Science and Semantic Technologies, ITMO University
Dmitrii Pliukhin: Laboratory of Information Science and Semantic Technologies, ITMO University
Daniil Radyush: Laboratory of Information Science and Semantic Technologies, ITMO University
Ivan Shilin: Laboratory of Information Science and Semantic Technologies, ITMO University
Markus Stocker: TIB—Leibniz Information Centre for Science and Technology
Eleni Tsalapati: Department of Informatics and Telecommunications, National and Kapodistrian University of Athens