Data Quality Automation: a Generic Approach for Large Linked Research Datasets
Introduction: When datasets are collected mainly for administrative rather than research purposes, data quality checks are necessary to ensure robust findings and to avoid biased results due to incomplete or inaccurate data. When done manually, data quality checks are time-consuming. We introduced...
Main Authors: | Muhammad A Elmessary, Daniel Thayer, Sarah Rees, Leticia ReesKemp, Arfon Rees |
---|---|
Format: | Article |
Language: | English |
Published: | Swansea University, 2018-09-01 |
Series: | International Journal of Population Data Science |
Online Access: | https://ijpds.org/article/view/1000 |
Similar Items
- Investigation and reporting of Data Quality within and between linked SAIL datasets
  by: Sarah Rees, et al.
  Published: (2017-04-01)
- Sensitive Data Flagging within Data Quality reports: R and Regex Integration for Effective Text Flagging in Large Datasets
  by: Alex-Ioan Coldea, et al.
  Published: (2024-09-01)
- Measuring follow-up time in routinely-collected health datasets: Challenges and solutions.
  by: Daniel Thayer, et al.
  Published: (2020-01-01)
- Repeatable Research Infrastructure Enabling Administrative Data Analysis
  by: Daniel Thayer, et al.
  Published: (2019-11-01)
- AD|ARC (Administrative Data | Agricultural Research Collection): Linking individual, household and farm business data for agricultural research – Challenges of linking agricultural datasets with individual-level records
  by: Sian Morrison-Rees, et al.
  Published: (2023-09-01)