Exploring the use of topological data analysis to automatically detect data quality faults

Data quality problems may occur in various forms in structured and semi-structured data sources. This paper details an unsupervised method of analyzing data quality that is agnostic to the semantics of the data, the format of the encoding, or the internal structure of the dataset. A distance functio...

Bibliographic Details
Main Author: M. Eduard Tudoreanu
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-12-01
Series: Frontiers in Big Data
Online Access: https://www.frontiersin.org/articles/10.3389/fdata.2022.931398/full