An open dataset of data lineage graphs for data governance research
Data have become valuable assets for enterprises. Data governance aims to manage and reuse data assets, facilitating enterprise management and enabling product innovations. A data lineage graph (DLG) is an abstracted collection of data assets and their data lineages in data governance. Analyzing DLG...
Main Authors: | Yunpeng Chen, Ying Zhao, Xuanjing Li, Jiang Zhang, Jiang Long, Fangfang Zhou |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2024-03-01 |
Series: | Visual Informatics |
Subjects: | Data asset, Data governance, Data lineage, Graph, Open dataset |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2468502X24000020 |
_version_ | 1797234356739112960 |
---|---|
author | Yunpeng Chen Ying Zhao Xuanjing Li Jiang Zhang Jiang Long Fangfang Zhou |
author_facet | Yunpeng Chen Ying Zhao Xuanjing Li Jiang Zhang Jiang Long Fangfang Zhou |
author_sort | Yunpeng Chen |
collection | DOAJ |
description | Data have become valuable assets for enterprises. Data governance aims to manage and reuse data assets, facilitating enterprise management and enabling product innovations. A data lineage graph (DLG) is an abstracted collection of data assets and their data lineages in data governance. Analyzing DLGs can provide rich data insights for data governance. However, the progress of data governance technologies is hindered by the shortage of available open datasets for DLGs. This paper introduces an open dataset of DLGs, including the DLG model, the dataset construction process, and applied areas. This real-world dataset is sourced from Huawei Cloud Computing Technology Company Limited, which contains 18 DLGs with three types of data assets and two types of relations. To the best of our knowledge, this dataset is the first open dataset of DLGs for data governance. This dataset can also support the development of other application areas, such as graph analytics and visualization. |
first_indexed | 2024-04-24T16:30:46Z |
format | Article |
id | doaj.art-b214f3b39b5b48dc8caa500b0bdbd341 |
institution | Directory Open Access Journal |
issn | 2468-502X |
language | English |
last_indexed | 2024-04-24T16:30:46Z |
publishDate | 2024-03-01 |
publisher | Elsevier |
record_format | Article |
series | Visual Informatics |
spelling | doaj.art-b214f3b39b5b48dc8caa500b0bdbd341; 2024-03-30T04:39:47Z; eng; Elsevier; Visual Informatics; 2468-502X; 2024-03-01; 8115; An open dataset of data lineage graphs for data governance research; Yunpeng Chen (0), Ying Zhao (1), Xuanjing Li (2), Jiang Zhang (3), Jiang Long (4), Fangfang Zhou (5); (0) Central South University, Changsha, 410083, Hunan, China; (1) Central South University, Changsha, 410083, Hunan, China; (2) Central South University, Changsha, 410083, Hunan, China; (3) Huawei Cloud Computing Technology Co., Ltd., Hangzhou, 310000, Zhejiang, China; (4) Huawei Cloud Computing Technology Co., Ltd., Hangzhou, 310000, Zhejiang, China; (5) Central South University, Changsha, 410083, Hunan, China (corresponding author); Data have become valuable assets for enterprises. Data governance aims to manage and reuse data assets, facilitating enterprise management and enabling product innovations. A data lineage graph (DLG) is an abstracted collection of data assets and their data lineages in data governance. Analyzing DLGs can provide rich data insights for data governance. However, the progress of data governance technologies is hindered by the shortage of available open datasets for DLGs. This paper introduces an open dataset of DLGs, including the DLG model, the dataset construction process, and applied areas. This real-world dataset is sourced from Huawei Cloud Computing Technology Company Limited, which contains 18 DLGs with three types of data assets and two types of relations. To the best of our knowledge, this dataset is the first open dataset of DLGs for data governance. This dataset can also support the development of other application areas, such as graph analytics and visualization.; http://www.sciencedirect.com/science/article/pii/S2468502X24000020; Data asset; Data governance; Data lineage; Graph; Open dataset |
spellingShingle | Yunpeng Chen Ying Zhao Xuanjing Li Jiang Zhang Jiang Long Fangfang Zhou An open dataset of data lineage graphs for data governance research Visual Informatics Data asset Data governance Data lineage Graph Open dataset |
title | An open dataset of data lineage graphs for data governance research |
title_full | An open dataset of data lineage graphs for data governance research |
title_fullStr | An open dataset of data lineage graphs for data governance research |
title_full_unstemmed | An open dataset of data lineage graphs for data governance research |
title_short | An open dataset of data lineage graphs for data governance research |
title_sort | open dataset of data lineage graphs for data governance research |
topic | Data asset Data governance Data lineage Graph Open dataset |
url | http://www.sciencedirect.com/science/article/pii/S2468502X24000020 |