Differential Privacy for Data and Model Publishing of Medical Data
Combining medical data with machine learning has unlocked the value of medical data. However, medical data contain a large amount of sensitive information, and mishandling them can leak personal privacy. Thus, both publishing the data and using them to train machine-learning models may expose patients' privacy.
Main Authors: | Zongkun Sun, Yinglong Wang, Minglei Shu, Ruixia Liu, Huiqi Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Deep learning; data privacy; differential privacy; data publishing |
Online Access: | https://ieeexplore.ieee.org/document/8868084/ |
author | Zongkun Sun; Yinglong Wang; Minglei Shu; Ruixia Liu; Huiqi Zhao |
collection | DOAJ |
description | Combining medical data with machine learning has unlocked the value of medical data. However, medical data contain a large amount of sensitive information, and mishandling them can leak personal privacy. Thus, both publishing the data and using them to train machine-learning models may expose patients' privacy. To address this issue, we propose two approaches. The first combines differential privacy with a decision tree (DPDT) to provide strong privacy guarantees for published data: it builds a weight-calculation system on the classification and regression tree (CART) method and uses the weights as an additional element of differential privacy, which reduces the negative impact of differential privacy on data availability. The second uses a differentially private mini-batch gradient descent algorithm (DPMB) to provide strong protection for training data: it tracks the privacy loss and keeps the model differentially private throughout gradient descent, preventing attackers from recovering personal information from the training data. Notably, we use the data processed by DPDT as the training data for DPMB to further strengthen privacy. |
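The abstract does not specify the DPMB update rule. As a rough illustration only, a differentially private mini-batch gradient descent step is commonly built from per-example gradient clipping plus calibrated Gaussian noise (the standard DP-SGD recipe). The function below is a hypothetical sketch for linear least squares, not the authors' algorithm; the loss and the `clip` and `noise_mult` parameters are assumptions.

```python
# Hypothetical sketch of one differentially private mini-batch
# gradient descent step (clip-and-noise recipe); NOT the paper's DPMB.
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_mult=1.1, rng=None):
    """One DP step for linear regression with squared loss."""
    rng = np.random.default_rng(rng)
    # Per-example gradients of 0.5 * (x.w - y)^2 are (x.w - y) * x.
    residuals = X @ w - y                         # shape (n,)
    grads = residuals[:, None] * X                # shape (n, d)
    # Clip each example's gradient to L2 norm <= clip, bounding
    # any single record's influence on the update.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    # Average the clipped gradients and add Gaussian noise whose
    # scale is calibrated to the clipping bound.
    noisy_grad = grads.mean(axis=0) + rng.normal(
        0.0, noise_mult * clip / len(X), size=w.shape)
    return w - lr * noisy_grad
```

Each such step consumes privacy budget; in practice a moments/Rényi-DP accountant tracks the cumulative (epsilon, delta) spent across steps, which corresponds to the "tracks the privacy loss" behavior the abstract describes.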
format | Article |
id | doaj.art-782d0690b8a34f13ac920f465eedccde |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | IEEE Access (ISSN 2169-3536), vol. 7, pp. 152103-152114, 2019-01-01. DOI: 10.1109/ACCESS.2019.2947295; IEEE article no. 8868084; DOAJ record doaj.art-782d0690b8a34f13ac920f465eedccde. "Differential Privacy for Data and Model Publishing of Medical Data." Zongkun Sun (https://orcid.org/0000-0001-6771-7215) and Huiqi Zhao: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China. Yinglong Wang (https://orcid.org/0000-0002-8350-7186), Minglei Shu (https://orcid.org/0000-0002-7136-1538), and Ruixia Liu: Shandong Computer Science Center (National Supercomputer Center in Jinan), Shandong Provincial Key Laboratory of Computer Networks, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China. Online access: https://ieeexplore.ieee.org/document/8868084/ |
title | Differential Privacy for Data and Model Publishing of Medical Data |
topic | Deep learning; data privacy; differential privacy; data publishing |
url | https://ieeexplore.ieee.org/document/8868084/ |