K-Anonymity Privacy Protection Algorithm for Multi-Dimensional Data against Skewness and Similarity Attacks

Privacy protection for published multi-dimensional data is currently receiving significant attention in application scenarios such as scientific research and policy-making. Clustering-based K-anonymity is the main method for desensitizing shared data, but it...

Bibliographic Details
Main Authors:	Bing Su, Jiaxuan Huang, Kelei Miao, Zhangquan Wang, Xudong Zhang, Yourong Chen
Format:	Article
Language:	English
Published:	MDPI AG 2023-01-01
Series:	Sensors
Subjects:	K-anonymity; multi-dimensional data; skewness attack; similarity attack; privacy protection
Online Access:	https://www.mdpi.com/1424-8220/23/3/1554

author	Bing Su; Jiaxuan Huang; Kelei Miao; Zhangquan Wang; Xudong Zhang; Yourong Chen
collection	DOAJ
description	Privacy protection for published multi-dimensional data is currently receiving significant attention in application scenarios such as scientific research and policy-making. Clustering-based K-anonymity is the main method for desensitizing shared data, but it suffers from inconsistent clustering results and low clustering accuracy. It also cannot defend against several common attacks, such as skewness and similarity attacks, at the same time. To defend against these attacks, we propose a K-anonymity privacy protection algorithm for multi-dimensional data against skewness and similarity attacks (KAPP), combined with t-closeness. First, we propose a multi-dimensional sensitive data clustering algorithm based on improved African vultures optimization. More specifically, we improve the initialization, fitness calculation, and solution update strategy for the cluster centers. The improved African vultures optimization can provide optimal solutions of various dimensions and achieve highly accurate clustering of the multi-dimensional dataset based on multiple sensitive attributes. This ensures that data in different clusters differ in their sensitive values. After the dataset is anonymized, each equivalence class contains fewer similar sensitive values, so the precondition for data theft through skewness and similarity attacks is no longer met. We also propose an equivalence class partition method based on t-closeness and a measurement of the difference value of the sensitive data distribution. Namely, we calculate the difference value of the sensitive data distribution for each equivalence class and then combine the equivalence classes with larger difference values. Each equivalence class satisfies t-closeness. This method ensures that the multi-dimensional data within an equivalence class differ in multiple sensitive attributes, and thus it can effectively defend against skewness and similarity attacks. Moreover, we generalize the sensitive attributes with significant weight and all quasi-identifier attributes to anonymize the dataset. The experimental results show that KAPP improves clustering accuracy, diversity, and anonymity compared with other similar methods under skewness and similarity attacks.
first_indexed	2024-03-11T09:24:51Z
format	Article
id	doaj.art-2001c3c10add4eb391527edd054939d4
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-11T09:24:51Z
publishDate	2023-01-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-2001c3c10add4eb391527edd054939d4 | 2023-11-16T18:02:46Z | eng | MDPI AG | Sensors | 1424-8220 | 2023-01-01 | vol. 23, no. 3, art. 1554 | 10.3390/s23031554 | K-Anonymity Privacy Protection Algorithm for Multi-Dimensional Data against Skewness and Similarity Attacks | Bing Su (School of Computer and Artificial Intelligence, Changzhou University, Changzhou 213164, China); Jiaxuan Huang (School of Computer and Artificial Intelligence, Changzhou University, Changzhou 213164, China); Kelei Miao (College of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China); Zhangquan Wang (College of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China); Xudong Zhang (College of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China); Yourong Chen (College of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China) | https://www.mdpi.com/1424-8220/23/3/1554 | K-anonymity; multi-dimensional data; skewness attack; similarity attack; privacy protection
title	K-Anonymity Privacy Protection Algorithm for Multi-Dimensional Data against Skewness and Similarity Attacks
topic	K-anonymity; multi-dimensional data; skewness attack; similarity attack; privacy protection
url	https://www.mdpi.com/1424-8220/23/3/1554
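
Note on the description: the abstract relies on two conditions for each equivalence class, a minimum size K and t-closeness, meaning the distribution of sensitive values in the class stays within a distance t of the distribution in the whole dataset. The Python sketch below illustrates only that check and is not the KAPP algorithm. It assumes a single categorical sensitive attribute and uses the variational distance (half the L1 distance) as the distance measure; the abstract does not state which measure the paper uses. The function names and the toy data are hypothetical.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each sensitive value in a list."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions (assumed measure)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)


def check_classes(classes, k, t):
    """Check each equivalence class against K-anonymity and t-closeness.

    classes: list of lists, each holding the sensitive values of one class.
    """
    overall = distribution([value for cls in classes for value in cls])
    report = []
    for cls in classes:
        distance = variational_distance(distribution(cls), overall)
        report.append({
            "size_ok": len(cls) >= k,   # K-anonymity: at least k records
            "distance": distance,       # distance to the overall distribution
            "t_ok": distance <= t,      # t-closeness: distance within t
        })
    return report


# Toy data (hypothetical): two equivalence classes of four records each.
classes = [
    ["flu", "flu", "cancer", "cancer"],
    ["flu", "flu", "flu", "flu"],
]
report = check_classes(classes, k=4, t=0.2)
for entry in report:
    print(entry)
```

In this toy dataset both classes meet the size requirement, but each lies at distance 0.25 from the overall distribution and therefore fails t-closeness at t = 0.2. The second class contains a single sensitive value, which is the kind of skewed class that the partition step described in the abstract is meant to avoid.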

Similar Items