Publishing Anonymized Set-Valued Data via Disassociation towards Analysis

Data publishing is a challenging task because of privacy preservation constraints. To ensure privacy, many anonymization techniques have been proposed. They differ in terms of the mathematical properties they verify and in terms of the functional objectives expected. Disassociation is one of the techniques...

Bibliographic Details
Main Authors:	Nancy Awad, Jean-Francois Couchot, Bechara Al Bouna, Laurent Philippe
Format:	Article
Language:	English
Published:	MDPI AG 2020-04-01
Series:	Future Internet
Subjects:	anonymization knowledge extraction ant colony clustering association rules utility privacy
Online Access:	https://www.mdpi.com/1999-5903/12/4/71

_version_	1797570435978625024
author	Nancy Awad Jean-Francois Couchot Bechara Al Bouna Laurent Philippe
author_facet	Nancy Awad Jean-Francois Couchot Bechara Al Bouna Laurent Philippe
author_sort	Nancy Awad
collection	DOAJ
description	Data publishing is a challenging task because of privacy preservation constraints. To ensure privacy, many anonymization techniques have been proposed. They differ in terms of the mathematical properties they verify and in terms of the functional objectives expected. Disassociation is one of the techniques that aim at anonymizing set-valued datasets (e.g., discrete locations, search and shopping items) while guaranteeing the confidentiality property known as k^m-anonymity. Disassociation separates the items of an itemset into vertical chunks to create ambiguity in the original associations. In a previous work, we defined a new ant-based clustering algorithm for the disassociation technique that preserves some items associated together, called utility rules, throughout the anonymization process, for accurate analysis. In this paper, we examine the disassociated dataset in terms of knowledge extraction. To make data analysis easy on top of the anonymized dataset, we define neighbor datasets, that is, datasets that result from a probabilistic re-association process. To assess the neighborhood notion, set-valued datasets are formalized as trees and a tree edit distance (TED) is directly applied between these neighbors. Finally, in the experiments, we show the faithfulness of the neighbors to knowledge extraction for future analysis.
first_indexed	2024-03-10T20:24:33Z
format	Article
id	doaj.art-ca8df5621635437da64a4d55694a19f9
institution	Directory Open Access Journal
issn	1999-5903
language	English
last_indexed	2024-03-10T20:24:33Z
publishDate	2020-04-01
publisher	MDPI AG
record_format	Article
series	Future Internet
spelling	doaj.art-ca8df5621635437da64a4d55694a19f9; 2023-11-19T21:55:59Z; eng; MDPI AG; Future Internet; 1999-5903; 2020-04-01; 12; 4; 71; 10.3390/fi12040071; Publishing Anonymized Set-Valued Data via Disassociation towards Analysis; Nancy Awad (Femto-ST Institute, UMR 6174 CNRS, University of Bourgogne-Franche-Comte, 25000 Besançon, France); Jean-Francois Couchot (Femto-ST Institute, UMR 6174 CNRS, University of Bourgogne-Franche-Comte, 25000 Besançon, France); Bechara Al Bouna (TICKET Laboratory, Antonine University, Hadat-Baabda 1003, Lebanon); Laurent Philippe (Femto-ST Institute, UMR 6174 CNRS, University of Bourgogne-Franche-Comte, 25000 Besançon, France); https://www.mdpi.com/1999-5903/12/4/71; anonymization; knowledge extraction; ant colony clustering; association rules; utility; privacy
spellingShingle	Nancy Awad Jean-Francois Couchot Bechara Al Bouna Laurent Philippe Publishing Anonymized Set-Valued Data via Disassociation towards Analysis Future Internet anonymization knowledge extraction ant colony clustering association rules utility privacy
title	Publishing Anonymized Set-Valued Data via Disassociation towards Analysis
title_full	Publishing Anonymized Set-Valued Data via Disassociation towards Analysis
title_fullStr	Publishing Anonymized Set-Valued Data via Disassociation towards Analysis
title_full_unstemmed	Publishing Anonymized Set-Valued Data via Disassociation towards Analysis
title_short	Publishing Anonymized Set-Valued Data via Disassociation towards Analysis
title_sort	publishing anonymized set valued data via disassociation towards analysis
topic	anonymization knowledge extraction ant colony clustering association rules utility privacy
url	https://www.mdpi.com/1999-5903/12/4/71
work_keys_str_mv	AT nancyawad publishinganonymizedsetvalueddataviadisassociationtowardsanalysis AT jeanfrancoiscouchot publishinganonymizedsetvalueddataviadisassociationtowardsanalysis AT becharaalbouna publishinganonymizedsetvalueddataviadisassociationtowardsanalysis AT laurentphilippe publishinganonymizedsetvalueddataviadisassociationtowardsanalysis

Similar Items