Automatic transparency evaluation for open knowledge extraction systems

Abstract. Background: This paper proposes Cyrus, a new transparency evaluation framework for Open Knowledge Extraction (OKE) systems. Cyrus is based on state-of-the-art transparency models and linked data quality assessment dimensions. It brings together a comprehensive view of transparency dimen...


Bibliographic Details
Main Authors:	Maryam Basereh, Annalina Caputo, Rob Brennan
Format:	Article
Language:	English
Published:	BMC 2023-08-01
Series:	Journal of Biomedical Semantics
Subjects:	Transparency framework; Automatic transparency evaluation; Open knowledge extraction; FAIRness assessment; Quality evaluation
Online Access:	https://doi.org/10.1186/s13326-023-00293-9

_version_	1797555588554555392
author	Maryam Basereh, Annalina Caputo, Rob Brennan
collection	DOAJ
description	Abstract. Background: This paper proposes Cyrus, a new transparency evaluation framework for Open Knowledge Extraction (OKE) systems. Cyrus is based on state-of-the-art transparency models and linked data quality assessment dimensions. It brings together a comprehensive view of transparency dimensions for OKE systems. The Cyrus framework is used to evaluate the transparency of three linked datasets, which are built from the same corpus by three state-of-the-art OKE systems. The evaluation is performed automatically using a combination of three state-of-the-art FAIRness (Findability, Accessibility, Interoperability, Reusability) assessment tools and a linked data quality evaluation framework called Luzzu. This evaluation covers the six Cyrus data transparency dimensions for which existing assessment tools could be identified. OKE systems extract structured knowledge from unstructured or semi-structured text in the form of linked data. These systems are fundamental components of advanced knowledge services. However, due to the lack of a transparency framework for OKE, most OKE systems are not transparent. This means that their processes and outcomes are not understandable and interpretable. A comprehensive framework sheds light on different aspects of transparency, allows the transparency of different systems to be compared by supporting the development of transparency scores, and gives insight into a system's transparency weaknesses and ways to improve them. Automatic transparency evaluation helps with scalability and facilitates transparency assessment. The transparency problem has been identified as critical by the European Union Trustworthy Artificial Intelligence (AI) guidelines. In this paper, Cyrus provides the first comprehensive view of transparency dimensions for OKE systems by merging the perspectives of the FAccT (Fairness, Accountability, and Transparency), FAIR, and linked data quality research communities.
Results: In Cyrus, data transparency includes ten dimensions, which are grouped into two categories. In this paper, six of these dimensions (provenance, interpretability, understandability, licensing, availability, and interlinking) have been evaluated automatically for three state-of-the-art OKE systems using state-of-the-art metrics and tools. Covid-on-the-Web is identified as having the highest mean transparency.
Conclusions: This is the first research to study the transparency of OKE systems that provides a comprehensive set of transparency dimensions spanning ethics, trustworthy AI, and data quality approaches to transparency. It also demonstrates how to perform automated transparency evaluation that combines existing FAIRness and linked data quality assessment tools for the first time. We show that state-of-the-art OKE systems vary in the transparency of the linked data they generate and that these differences can be automatically quantified, leading to potential applications in trustworthy AI, compliance, data protection, data governance, and future OKE system design and testing.
first_indexed	2024-03-10T16:49:37Z
format	Article
id	doaj.art-3a123ff3f7054a45a4b340f113944134
institution	Directory Open Access Journal
issn	2041-1480
language	English
last_indexed	2024-03-10T16:49:37Z
publishDate	2023-08-01
publisher	BMC
record_format	Article
series	Journal of Biomedical Semantics
spelling	doaj.art-3a123ff3f7054a45a4b340f113944134; 2023-11-20T11:21:22Z; eng; BMC; Journal of Biomedical Semantics; 2041-1480; 2023-08-01; 14; 1; 1; 18; 10.1186/s13326-023-00293-9; Automatic transparency evaluation for open knowledge extraction systems; Maryam Basereh (0), Annalina Caputo (1), Rob Brennan (2); (0) School of Computing, Dublin City University; (1) School of Computing, Dublin City University; (2) ADAPT Centre, School of Computer Science, University College Dublin; https://doi.org/10.1186/s13326-023-00293-9; Transparency framework; Automatic transparency evaluation; Open knowledge extraction; FAIRness assessment; Quality evaluation
title	Automatic transparency evaluation for open knowledge extraction systems
topic	Transparency framework; Automatic transparency evaluation; Open knowledge extraction; FAIRness assessment; Quality evaluation
url	https://doi.org/10.1186/s13326-023-00293-9


Similar Items