Toward Privacy Preservation Using Clustering Based Anonymization: Recent Advances and Future Research Outlook

With the continuous increase in avenues of personal data generation, privacy protection has become a hot research topic resulting in various proposed mechanisms to address this social issue. The main technical solutions for guaranteeing a user’s privacy are encryption, pseudonymization, a...

Bibliographic Details
Main Authors:	Abdul Majeed, Safiullah Khan, Seong Oun Hwang
Format:	Article
Language:	English
Published:	IEEE 2022-01-01
Series:	IEEE Access
Subjects:	Privacy; utility; anonymization; personal data; clustering; social networks
Online Access:	https://ieeexplore.ieee.org/document/9775092/

_version_	1818211203219980288
author	Abdul Majeed, Safiullah Khan, Seong Oun Hwang
author_sort	Abdul Majeed
collection	DOAJ
description	With the continuous increase in avenues of personal data generation, privacy protection has become a hot research topic, and various mechanisms have been proposed to address this social issue. The main technical solutions for guaranteeing a user’s privacy are encryption, pseudonymization, anonymization, differential privacy (DP), and obfuscation. Despite the success of other solutions, anonymization has been widely used in commercial settings for privacy preservation because of its algorithmic simplicity and low computing overhead. It facilitates unconstrained analysis of published data that DP and the other latest techniques cannot offer, and it is a mainstream solution for responsible data science. In this paper, we present a comprehensive analysis of clustering-based anonymization mechanisms (CAMs) that have been recently proposed to preserve both privacy and utility in data publishing. We systematically categorize the existing CAMs based on heterogeneous types of data (tables, graphs, matrices, etc.), and we present an up-to-date, extensive review of existing CAMs and the metrics used for their evaluation. We discuss the superiority and effectiveness of CAMs over traditional anonymization mechanisms. We highlight the significance of CAMs for privacy preservation in different computing paradigms, such as social networks, the internet of things, cloud computing, AI, and location-based systems. Furthermore, we present various representative CAMs that have been proposed to compromise individual privacy rather than safeguard it. In addition, this article provides extended knowledge about each technique (e.g., key assertion(s), strengths, weaknesses, clustering methods used in the anonymization process, and percentage improvements in quantitative results), giving a clear view of how much this topic has been investigated thus far and which research gaps call for pertinent solutions in the near future. Finally, we discuss the technical challenges of applying CAMs, and we suggest promising opportunities for future research. To the best of our knowledge, this is the first work to systematically cover current CAMs involving different data types and computing paradigms.
first_indexed	2024-12-12T05:28:46Z
format	Article
id	doaj.art-7051dbd2db8a4cc38135ca26a1efb3d9
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-12T05:28:46Z
publishDate	2022-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-7051dbd2db8a4cc38135ca26a1efb3d9; 2022-12-22T00:36:23Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 53066; 53097; 10.1109/ACCESS.2022.3175219; 9775092; Toward Privacy Preservation Using Clustering Based Anonymization: Recent Advances and Future Research Outlook; Abdul Majeed (https://orcid.org/0000-0002-3030-5054); Safiullah Khan (https://orcid.org/0000-0001-8342-6928); Seong Oun Hwang (https://orcid.org/0000-0003-4240-6255); Department of Computer Engineering, Gachon University, Seongnam-si, Republic of Korea; Department of IT Convergence Engineering, Gachon University, Seongnam-si, Republic of Korea; Department of Computer Engineering, Gachon University, Seongnam-si, Republic of Korea; https://ieeexplore.ieee.org/document/9775092/; Privacy; utility; anonymization; personal data; clustering; social networks
title	Toward Privacy Preservation Using Clustering Based Anonymization: Recent Advances and Future Research Outlook
topic	Privacy; utility; anonymization; personal data; clustering; social networks
url	https://ieeexplore.ieee.org/document/9775092/
