Enhancing Data Quality using Human Computation and Crowd Sourcing

Bibliographic Details
Main Authors: Vikram Kumar Kirpalani, Muhammad Ejaz Tayab
Format: Article
Language: English
Published: Shaheed Zulfikar Ali Bhutto Institute of Science and Technology 2015-07-01
Series: JISR on Computing
Online Access: https://jisrc.szabist.edu.pk/ojs/index.php/jisrc/article/view/123
Description
Summary: This paper addresses quality issues in the data dumps available at DBpedia by using associations, i.e. a concept hierarchy, to enhance the quality of those dumps. The dumps are extracted from Wikipedia, and their issues stem either from the data extraction frameworks or from human error in the crowdsourcing efforts on Wikipedia. Combining human computation techniques and crowdsourcing with query morphing makes it easier to investigate these issues in depth. One key issue in the datasets is the presence of multiple values in a single attribute, and vice versa, especially in the “Place of Birth” field of important personalities. The paper describes the implementation process used to resolve these issues and includes a survey on crowdsourcing that highlights its impact.
ISSN: 2412-0448, 1998-4154