Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation

The collection of personal data is exponentially growing and, as a result, individual privacy is endangered accordingly. With the aim to lessen privacy risks whilst maintaining high degrees of data utility, a variety of techniques have been proposed, being microaggregation a very popular one. Microa...

Full description

Bibliographic Details
Main Authors: Armando Maya-López, Fran Casino, Agusti Solanas
Format: Article
Language:English
Published: MDPI AG 2021-05-01
Series:Symmetry
Subjects:
Online Access:https://www.mdpi.com/2073-8994/13/6/916
_version_ 1797533268835303424
author Armando Maya-López
Fran Casino
Agusti Solanas
author_facet Armando Maya-López
Fran Casino
Agusti Solanas
author_sort Armando Maya-López
collection DOAJ
description The collection of personal data is exponentially growing and, as a result, individual privacy is endangered accordingly. With the aim to lessen privacy risks whilst maintaining high degrees of data utility, a variety of techniques have been proposed, being microaggregation a very popular one. Microaggregation is a family of perturbation methods, in which its principle is to aggregate personal data records (i.e., microdata) in groups so as to preserve privacy through <i>k</i>-anonymity. The multivariate microaggregation problem is known to be NP-Hard; however, its univariate version could be optimally solved in polynomial time using the Hansen-Mukherjee (HM) algorithm. In this article, we propose a heuristic solution to the multivariate microaggregation problem inspired by the Traveling Salesman Problem (TSP) and the optimal univariate microaggregation solution. Given a multivariate dataset, first, we apply a TSP-tour construction heuristic to generate a Hamiltonian path through all dataset records. Next, we use the order provided by this Hamiltonian path (i.e., a given permutation of the records) as input to the Hansen-Mukherjee algorithm, virtually transforming it into a multivariate microaggregation solver we call Multivariate Hansen-Mukherjee (MHM). Our intuition is that good solutions to the TSP would yield Hamiltonian paths allowing the Hansen-Mukherjee algorithm to find good solutions to the multivariate microaggregation problem. We have tested our method with well-known benchmark datasets. Moreover, with the aim to show the usefulness of our approach to protecting location privacy, we have tested our solution with real-life trajectories datasets, too. We have compared the results of our algorithm with those of the best performing solutions, and we show that our proposal reduces the information loss resulting from the microaggregation. Overall, results suggest that transforming the multivariate microaggregation problem into its univariate counterpart by ordering microdata records with a proper Hamiltonian path and applying an optimal univariate solution leads to a reduction of the perturbation error whilst keeping the same privacy guarantees.
first_indexed 2024-03-10T11:12:08Z
format Article
id doaj.art-5280732a3c304526b536205baa6053a8
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-03-10T11:12:08Z
publishDate 2021-05-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-5280732a3c304526b536205baa6053a82023-11-21T20:41:23ZengMDPI AGSymmetry2073-89942021-05-0113691610.3390/sym13060916Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate MicroaggregationArmando Maya-López0Fran Casino1Agusti Solanas2Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, SpainDepartment of Informatics, University of Piraeus, Karaoli kai dimitriou 80, 18534 Pireas, GreeceDepartment of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, SpainThe collection of personal data is exponentially growing and, as a result, individual privacy is endangered accordingly. With the aim to lessen privacy risks whilst maintaining high degrees of data utility, a variety of techniques have been proposed, being microaggregation a very popular one. Microaggregation is a family of perturbation methods, in which its principle is to aggregate personal data records (i.e., microdata) in groups so as to preserve privacy through <i>k</i>-anonymity. The multivariate microaggregation problem is known to be NP-Hard; however, its univariate version could be optimally solved in polynomial time using the Hansen-Mukherjee (HM) algorithm. In this article, we propose a heuristic solution to the multivariate microaggregation problem inspired by the Traveling Salesman Problem (TSP) and the optimal univariate microaggregation solution. Given a multivariate dataset, first, we apply a TSP-tour construction heuristic to generate a Hamiltonian path through all dataset records. Next, we use the order provided by this Hamiltonian path (i.e., a given permutation of the records) as input to the Hansen-Mukherjee algorithm, virtually transforming it into a multivariate microaggregation solver we call Multivariate Hansen-Mukherjee (MHM). Our intuition is that good solutions to the TSP would yield Hamiltonian paths allowing the Hansen-Mukherjee algorithm to find good solutions to the multivariate microaggregation problem. We have tested our method with well-known benchmark datasets. Moreover, with the aim to show the usefulness of our approach to protecting location privacy, we have tested our solution with real-life trajectories datasets, too. We have compared the results of our algorithm with those of the best performing solutions, and we show that our proposal reduces the information loss resulting from the microaggregation. Overall, results suggest that transforming the multivariate microaggregation problem into its univariate counterpart by ordering microdata records with a proper Hamiltonian path and applying an optimal univariate solution leads to a reduction of the perturbation error whilst keeping the same privacy guarantees.https://www.mdpi.com/2073-8994/13/6/916microaggregationstatistical disclosure controlgraph theorytraveling salesman problemdata privacylocation privacy
spellingShingle Armando Maya-López
Fran Casino
Agusti Solanas
Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation
Symmetry
microaggregation
statistical disclosure control
graph theory
traveling salesman problem
data privacy
location privacy
title Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation
title_full Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation
title_fullStr Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation
title_full_unstemmed Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation
title_short Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation
title_sort improving multivariate microaggregation through hamiltonian paths and optimal univariate microaggregation
topic microaggregation
statistical disclosure control
graph theory
traveling salesman problem
data privacy
location privacy
url https://www.mdpi.com/2073-8994/13/6/916
work_keys_str_mv AT armandomayalopez improvingmultivariatemicroaggregationthroughhamiltonianpathsandoptimalunivariatemicroaggregation
AT francasino improvingmultivariatemicroaggregationthroughhamiltonianpathsandoptimalunivariatemicroaggregation
AT agustisolanas improvingmultivariatemicroaggregationthroughhamiltonianpathsandoptimalunivariatemicroaggregation