Differential Privacy for Data and Model Publishing of Medical Data

Combining medical data and machine learning has fully exploited the value of medical data. However, medical data contain a large amount of sensitive information, and inappropriate handling can lead to the leakage of personal privacy. Thus, both publishing the data and using it to train machine learning models may reveal patients' privacy. To address this issue, we propose two effective approaches. The first, differential privacy with decision trees (DPDT), provides strong privacy guarantees for published data: it establishes a weight calculation system based on the classification and regression tree (CART) method and takes the weights as a new element of differential privacy, letting them participate in privacy protection and reduce the negative impact of differential privacy on data availability. The second, a differentially private mini-batch gradient descent algorithm (DPMB), provides strong protection for training data: it tracks the privacy loss and makes the model satisfy differential privacy during gradient descent, preventing attackers from extracting personal information from the training data. Notably, we use the data processed by DPDT as the training data for DPMB to further strengthen data privacy.

Bibliographic Details
Main Authors: Zongkun Sun, Yinglong Wang, Minglei Shu, Ruixia Liu, Huiqi Zhao
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Collection: DOAJ (Directory of Open Access Journals)
Subjects: Deep learning; data privacy; differential privacy; data publishing
Online Access: https://ieeexplore.ieee.org/document/8868084/
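The paper's DPDT method is only summarized in the abstract; as context for how weights can enter a differential-privacy mechanism, the following is a minimal Python sketch of a weighted Laplace mechanism, not the authors' implementation. The per-attribute weights here are hypothetical stand-ins for the paper's CART-derived weights: the total privacy budget epsilon is divided across attributes in proportion to their weights, so attributes judged more important for data availability receive a larger share of the budget and therefore less noise.

```python
import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise by the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def weighted_laplace_publish(record, weights, epsilon, sensitivity=1.0):
    """Publish one numeric record under epsilon-differential privacy.

    The total budget epsilon is split across attributes in proportion to
    their weights (hypothetical stand-ins for CART-derived importance), so
    heavily weighted attributes get a larger share of epsilon and hence
    smaller Laplace noise, preserving more of their utility.
    """
    total = sum(weights[attr] for attr in record)
    noisy = {}
    for attr, value in record.items():
        eps_attr = epsilon * weights[attr] / total
        # Laplace mechanism: noise scale = sensitivity / budget share.
        noisy[attr] = value + laplace_noise(sensitivity / eps_attr)
    return noisy
```

For example, `weighted_laplace_publish({"age": 53.0, "bp": 120.0}, {"age": 0.7, "bp": 0.3}, epsilon=1.0)` spends 70% of the budget on "age", so its published value stays closer to the true one than "bp" does on average.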
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2947295 (IEEE Access, vol. 7, 2019, pp. 152103-152114, article 8868084)

Author affiliations:
Zongkun Sun (ORCID: 0000-0001-6771-7215): College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
Yinglong Wang (ORCID: 0000-0002-8350-7186): Shandong Computer Science Center (National Supercomputer Center in Jinan), Shandong Provincial Key Laboratory of Computer Networks, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Minglei Shu (ORCID: 0000-0002-7136-1538): Shandong Computer Science Center (National Supercomputer Center in Jinan), Shandong Provincial Key Laboratory of Computer Networks, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Ruixia Liu: Shandong Computer Science Center (National Supercomputer Center in Jinan), Shandong Provincial Key Laboratory of Computer Networks, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Huiqi Zhao: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
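DPMB is likewise only summarized in the abstract; the sketch below shows the generic differentially private mini-batch gradient descent pattern such algorithms follow (per-example gradient clipping to bound sensitivity, Gaussian noise on the summed gradient, and noisy-step counting for privacy-loss accounting), in the style popularized by DP-SGD. The linear model, squared loss, and function names are illustrative assumptions, not the paper's code.

```python
import math
import random

def l2_clip(grad, clip_norm):
    """Scale a per-example gradient down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > clip_norm:
        return [g * clip_norm / norm for g in grad]
    return grad

def dp_minibatch_step(w, batch, lr, clip_norm, noise_mult):
    """One differentially private gradient step on a linear model, squared loss.

    Clipping bounds each example's contribution (the sensitivity), and
    Gaussian noise with std = noise_mult * clip_norm masks the presence or
    absence of any single example in the batch.
    """
    summed = [0.0] * len(w)
    for x, y in batch:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        grad = [2.0 * (pred - y) * xi for xi in x]  # d/dw of (pred - y)^2
        grad = l2_clip(grad, clip_norm)
        summed = [s + g for s, g in zip(summed, grad)]
    noisy_avg = [(s + random.gauss(0.0, noise_mult * clip_norm)) / len(batch)
                 for s in summed]
    return [wi - lr * g for wi, g in zip(w, noisy_avg)]

# Each call releases one noisy gradient, i.e. one (epsilon, delta)-DP step;
# under basic composition the total privacy loss grows with the number of
# steps, which is why DPMB-style training tracks it across the whole run.
```

With the noise multiplier set to 0 this reduces to plain mini-batch gradient descent, which makes the privacy/utility trade-off easy to probe: increasing `noise_mult` strengthens the guarantee per step at the cost of noisier updates.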