Data Re-Identification—A Case of Retrieving Masked Data from Electronic Toll Collection

Bibliographic Details
Main Authors: Hsieh-Hong Huang, Jian-Wei Lin, Chia-Hsuan Lin
Format: Article
Language: English
Published: MDPI AG 2019-04-01
Series: Symmetry
Online Access: https://www.mdpi.com/2073-8994/11/4/550
Description
Summary: With the growth of big data and open data in recent years, data anonymization has become increasingly important. Original data need to be anonymized before public release so that individuals cannot be identified. A growing variety of de-identification methods have been proposed to reduce privacy risks; however, there is still much room for improvement. The purpose of this study is to demonstrate that re-identification from masked data is possible and to compare the pros and cons of different de-identification methods. Using a set of electronic toll collection data from Taiwan, we successfully re-identified vehicles with specific patterns. We then applied four de-identification methods, compared their strengths and weaknesses, and evaluated their appropriateness.
ISSN: 2073-8994