Data Re-Identification—A Case of Retrieving Masked Data from Electronic Toll Collection

With the growth of big data and open data in recent years, data anonymization has become increasingly important. Before original data are released to the public, they need to be anonymized so that personal identities are not revealed. There is a growing variety of de-identification methods which...

Bibliographic Details
Main Authors: Hsieh-Hong Huang, Jian-Wei Lin, Chia-Hsuan Lin
Format: Article
Language: English
Published: MDPI AG 2019-04-01
Series: Symmetry
Subjects: data anonymity; open data; de-identification; re-identification; electronic toll collection
Online Access: https://www.mdpi.com/2073-8994/11/4/550
author Hsieh-Hong Huang
Jian-Wei Lin
Chia-Hsuan Lin
author_sort Hsieh-Hong Huang
collection DOAJ
description With the growth of big data and open data in recent years, data anonymization has become increasingly important. Before original data are released to the public, they need to be anonymized so that personal identities are not revealed. A growing variety of de-identification methods has been proposed to reduce privacy risks; however, there is still much to be improved. The purpose of this study is to demonstrate the possibility of re-identification from masked data and to compare the pros and cons of different de-identification methods. Using a set of electronic toll collection data from Taiwan, we successfully re-identified vehicles with specific patterns. We then applied four de-identification methods, compared their strengths and weaknesses, and evaluated their appropriateness.
first_indexed 2024-04-11T18:40:59Z
format Article
id doaj.art-a90afeb157aa488d91f7782486d81c02
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-04-11T18:40:59Z
publishDate 2019-04-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-a90afeb157aa488d91f7782486d81c02 | 2022-12-22T04:08:58Z | eng | MDPI AG | Symmetry | 2073-8994 | 2019-04-01 | 11 | 4 | 550 | 10.3390/sym11040550 | sym11040550
Data Re-Identification—A Case of Retrieving Masked Data from Electronic Toll Collection
Hsieh-Hong Huang (0): Department of Information Science and Management Systems, National Taitung University, Taitung 95092, Taiwan
Jian-Wei Lin (1): Department of International Business, Chien Hsin University of Science and Technology, Taoyuan 32097, Taiwan
Chia-Hsuan Lin (2): Department of Information Science and Management Systems, National Taitung University, Taitung 95092, Taiwan
With the growth of big data and open data in recent years, data anonymization has become increasingly important. Before original data are released to the public, they need to be anonymized so that personal identities are not revealed. A growing variety of de-identification methods has been proposed to reduce privacy risks; however, there is still much to be improved. The purpose of this study is to demonstrate the possibility of re-identification from masked data and to compare the pros and cons of different de-identification methods. Using a set of electronic toll collection data from Taiwan, we successfully re-identified vehicles with specific patterns. We then applied four de-identification methods, compared their strengths and weaknesses, and evaluated their appropriateness.
https://www.mdpi.com/2073-8994/11/4/550
data anonymity | open data | de-identification | re-identification | electronic toll collection
title Data Re-Identification—A Case of Retrieving Masked Data from Electronic Toll Collection
topic data anonymity
open data
de-identification
re-identification
electronic toll collection
url https://www.mdpi.com/2073-8994/11/4/550