Pricing Personal Data Based on Data Provenance

Data have become an important asset. Mining the value contained in personal data, making personal data an exchangeable commodity, has become a hot spot of industry research. Then, how to price personal data reasonably becomes a problem we have to face. Based on previous research on data provenance,...

Bibliographic Details
Main Authors: Yuncheng Shen, Bing Guo, Yan Shen, Fan Wu, Hong Zhang, Xuliang Duan, Xiangqian Dong
Format: Article
Language: English
Published: MDPI AG 2019-08-01
Series: Applied Sciences
Subjects: personal data; data provenance; arbitrage; data pricing
Online Access: https://www.mdpi.com/2076-3417/9/16/3388
_version_ 1818024268337774592
author Yuncheng Shen
Bing Guo
Yan Shen
Fan Wu
Hong Zhang
Xuliang Duan
Xiangqian Dong
author_facet Yuncheng Shen
Bing Guo
Yan Shen
Fan Wu
Hong Zhang
Xuliang Duan
Xiangqian Dong
author_sort Yuncheng Shen
collection DOAJ
description Data have become an important asset. Mining the value contained in personal data, making personal data an exchangeable commodity, has become a hot spot of industry research. Then, how to price personal data reasonably becomes a problem we have to face. Based on previous research on data provenance, this paper proposes a novel minimum provenance pricing method, which is to price the minimum source tuple set that contributes to the query. Our pricing model first sets prices for source tuples according to their importance and then makes query pricing based on data provenance, which considers both the importance of the data itself and the relationships between the data. We design an exact algorithm that can calculate the exact price of a query in exponential complexity. Furthermore, we design an easy approximate algorithm, which can calculate the approximate price of the query in polynomial time. We instantiated our model with a select-joint query and a complex query and extensively evaluated its performances on two practical datasets. The experimental results show that our pricing model is feasible.
first_indexed 2024-12-10T03:57:31Z
format Article
id doaj.art-031d2d9e1f914e3c81d1f00a55b591eb
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-12-10T03:57:31Z
publishDate 2019-08-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-031d2d9e1f914e3c81d1f00a55b591eb
2022-12-22T02:03:03Z
eng
MDPI AG
Applied Sciences
2076-3417
2019-08-01
9
16
3388
10.3390/app9163388
app9163388
Pricing Personal Data Based on Data Provenance
Yuncheng Shen (College of Computer Science, Sichuan University, Chengdu 610065, China)
Bing Guo (College of Computer Science, Sichuan University, Chengdu 610065, China)
Yan Shen (School of Control Engineering, Chengdu University of Information Technology, Chengdu 610225, China)
Fan Wu (Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, Shanghai 200240, China)
Hong Zhang (College of Computer Science, Sichuan University, Chengdu 610065, China)
Xuliang Duan (College of Computer Science, Sichuan University, Chengdu 610065, China)
Xiangqian Dong (College of Computer Science, Sichuan University, Chengdu 610065, China)
https://www.mdpi.com/2076-3417/9/16/3388
personal data
data provenance
arbitrage
data pricing
spellingShingle Yuncheng Shen
Bing Guo
Yan Shen
Fan Wu
Hong Zhang
Xuliang Duan
Xiangqian Dong
Pricing Personal Data Based on Data Provenance
Applied Sciences
personal data
data provenance
arbitrage
data pricing
title Pricing Personal Data Based on Data Provenance
title_full Pricing Personal Data Based on Data Provenance
title_fullStr Pricing Personal Data Based on Data Provenance
title_full_unstemmed Pricing Personal Data Based on Data Provenance
title_short Pricing Personal Data Based on Data Provenance
title_sort pricing personal data based on data provenance
topic personal data
data provenance
arbitrage
data pricing
url https://www.mdpi.com/2076-3417/9/16/3388
work_keys_str_mv AT yunchengshen pricingpersonaldatabasedondataprovenance
AT bingguo pricingpersonaldatabasedondataprovenance
AT yanshen pricingpersonaldatabasedondataprovenance
AT fanwu pricingpersonaldatabasedondataprovenance
AT hongzhang pricingpersonaldatabasedondataprovenance
AT xuliangduan pricingpersonaldatabasedondataprovenance
AT xiangqiandong pricingpersonaldatabasedondataprovenance