How to Inspect and Measure Data Quality about Scientific Publications: Use Case of Wikipedia and CRIS Databases
The quality assurance of publication data in collaborative knowledge bases and in current research information systems (CRIS) is becoming increasingly relevant with the use of freely available spatial information in different application scenarios. When integrating this data into CRIS, it is necessary to...
Main Authors: | Otmane Azeroual, Włodzimierz Lewoniewski |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-04-01 |
Series: | Algorithms |
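Subjects: | Wikipedia; current research information systems (CRIS); publications data; data quality; objective quality dimensions; research data processing |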
Online Access: | https://www.mdpi.com/1999-4893/13/5/107 |
_version_ | 1797569557268791296 |
---|---|
author | Otmane Azeroual; Włodzimierz Lewoniewski |
author_facet | Otmane Azeroual; Włodzimierz Lewoniewski |
author_sort | Otmane Azeroual |
collection | DOAJ |
description | The quality assurance of publication data in collaborative knowledge bases and in current research information systems (CRIS) is becoming increasingly relevant with the use of freely available spatial information in different application scenarios. When integrating this data into CRIS, it is necessary to be able to recognize and assess its quality. Only then is it possible to compile a result from the available data that fulfills its purpose for the user, namely to deliver reliable data and information. This paper discusses the quality problems of source metadata in Wikipedia and CRIS. Based on real data from over 40 million Wikipedia articles in various languages, we performed a preliminary quality analysis of the metadata of scientific publications using a data quality tool. So far, no data quality measurements have been programmed with Python to assess the quality of metadata from scientific publications in Wikipedia and CRIS. With this in mind, we programmed the methods and algorithms as code but present them in this paper as pseudocode; they measure quality along objective data quality dimensions such as completeness, correctness, consistency, and timeliness. This was prepared as a macro service so that users can apply the measurement results together with the program code to assess the metadata of their scientific publications, and so that management can rely on high-quality data when making decisions. |
first_indexed | 2024-03-10T20:13:07Z |
format | Article |
id | doaj.art-46e3109ebd72472584af02202640c15d |
institution | Directory Open Access Journal |
issn | 1999-4893 |
language | English |
last_indexed | 2024-03-10T20:13:07Z |
publishDate | 2020-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Algorithms |
spelling | doaj.art-46e3109ebd72472584af02202640c15d; 2023-11-19T22:45:36Z; eng; MDPI AG; Algorithms; 1999-4893; 2020-04-01; 13; 5; 107; 10.3390/a13050107; How to Inspect and Measure Data Quality about Scientific Publications: Use Case of Wikipedia and CRIS Databases; Otmane Azeroual [0]; Włodzimierz Lewoniewski [1]; [0] German Centre for Higher Education Research and Science Studies (DZHW), 10117 Berlin, Germany; [1] Department of Information Systems, Poznań University of Economics and Business, 61-875 Poznań, Poland; The quality assurance of publication data in collaborative knowledge bases and in current research information systems (CRIS) is becoming increasingly relevant with the use of freely available spatial information in different application scenarios. When integrating this data into CRIS, it is necessary to be able to recognize and assess its quality. Only then is it possible to compile a result from the available data that fulfills its purpose for the user, namely to deliver reliable data and information. This paper discusses the quality problems of source metadata in Wikipedia and CRIS. Based on real data from over 40 million Wikipedia articles in various languages, we performed a preliminary quality analysis of the metadata of scientific publications using a data quality tool. So far, no data quality measurements have been programmed with Python to assess the quality of metadata from scientific publications in Wikipedia and CRIS. With this in mind, we programmed the methods and algorithms as code but present them in this paper as pseudocode; they measure quality along objective data quality dimensions such as completeness, correctness, consistency, and timeliness. This was prepared as a macro service so that users can apply the measurement results together with the program code to assess the metadata of their scientific publications, and so that management can rely on high-quality data when making decisions.; https://www.mdpi.com/1999-4893/13/5/107; Wikipedia; current research information systems (CRIS); publications data; data quality; objective quality dimensions; research data processing |
spellingShingle | Otmane Azeroual; Włodzimierz Lewoniewski; How to Inspect and Measure Data Quality about Scientific Publications: Use Case of Wikipedia and CRIS Databases; Algorithms; Wikipedia; current research information systems (CRIS); publications data; data quality; objective quality dimensions; research data processing |
title | How to Inspect and Measure Data Quality about Scientific Publications: Use Case of Wikipedia and CRIS Databases |
title_sort | how to inspect and measure data quality about scientific publications use case of wikipedia and cris databases |
topic | Wikipedia; current research information systems (CRIS); publications data; data quality; objective quality dimensions; research data processing |
url | https://www.mdpi.com/1999-4893/13/5/107 |
work_keys_str_mv | AT otmaneazeroual howtoinspectandmeasuredataqualityaboutscientificpublicationsusecaseofwikipediaandcrisdatabases AT włodzimierzlewoniewski howtoinspectandmeasuredataqualityaboutscientificpublicationsusecaseofwikipediaandcrisdatabases |
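
The description above names four objective data quality dimensions (completeness, correctness, consistency, and timeliness) that the authors measured with Python. The following is only a minimal sketch of what such measurements could look like for a single publication metadata record; it is not the authors' code or pseudocode. The field names (`title`, `authors`, `year`, `doi`, `last_updated`), the DOI pattern, the year range, and the one-year freshness window are all assumptions made for illustration.

```python
import re
from datetime import date

# Hypothetical required fields; the schema used in the paper is not given in this record.
REQUIRED_FIELDS = ("title", "authors", "year", "doi")

# Assumed syntactic DOI check: "10.", a 4-9 digit registrant code, "/", a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def completeness(record):
    """Share of required fields that are present and non-empty."""
    filled = sum(1 for field in REQUIRED_FIELDS if record.get(field))
    return filled / len(REQUIRED_FIELDS)


def correctness(record):
    """1.0 if the DOI is syntactically valid under the assumed pattern, else 0.0."""
    doi = record.get("doi") or ""
    return 1.0 if DOI_PATTERN.match(doi) else 0.0


def consistency(record, today):
    """1.0 if the publication year is an integer in a plausible range, else 0.0."""
    year = record.get("year")
    plausible = isinstance(year, int) and 1500 <= year <= today.year
    return 1.0 if plausible else 0.0


def timeliness(record, today, max_age_days=365):
    """Linear decay from 1.0 (updated today) to 0.0 (max_age_days old or older)."""
    last_updated = record.get("last_updated")
    if last_updated is None:
        return 0.0
    age_days = (today - last_updated).days
    return max(0.0, min(1.0, 1.0 - age_days / max_age_days))


def quality_report(record, today):
    """Collect all four scores for one record in a dictionary."""
    return {
        "completeness": completeness(record),
        "correctness": correctness(record),
        "consistency": consistency(record, today),
        "timeliness": timeliness(record, today),
    }


if __name__ == "__main__":
    # Example record built from values in this catalog entry; the dates are illustrative.
    example = {
        "title": "How to Inspect and Measure Data Quality about Scientific Publications",
        "authors": ["Otmane Azeroual", "Włodzimierz Lewoniewski"],
        "year": 2020,
        "doi": "10.3390/a13050107",
        "last_updated": date(2024, 3, 10),
    }
    print(quality_report(example, today=date(2024, 3, 10)))
```

Each function returns a score between 0 and 1 so that the dimensions can be compared or averaged across many records. How the scores should be weighted or thresholded is a design decision this sketch leaves open.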