Speeding up privacy preserving data mining techniques

Bibliographic Details
Main Author: Tran, Huy Duc
Other Authors: Ng Wee Keong
Format: Thesis
Language: English
Published: 2016
Subjects:
Online Access: https://hdl.handle.net/10356/68814
Description
Summary: Privacy-Preserving Data Mining (PPDM) allows one to discover hidden patterns across multiple databases while keeping the underlying data private. Since its inception in two pioneering works by Agrawal [AS00] and Lindell [LP00], PPDM has attracted much attention from the research community. A variety of secure protocols have been proposed, ranging from association rule mining to classification and clustering. There are two major approaches in PPDM: randomization and secure multi-party computation. The former relies on statistical properties, adding noise to the original values to hide sensitive data. The latter uses encryption techniques to prevent adversaries from seeing the original data. The methods proposed in this thesis follow the second approach. We first introduce CSSP, an efficient privacy-preserving protocol for computing the scalar product among multiple parties. The protocol uses caching techniques made possible by multiplicatively homomorphic cryptosystems. When applied to association rule mining problems, CSSP outperforms existing work in terms of running time while maintaining the same level of security. Since data is constantly updated, protocols need to adapt to the changes. To this end, we propose an incremental privacy-preserving protocol for association rule mining that allows parties to perform mining tasks on the updated data instead of the entire data set. The protocol, called INCRE, scans old databases at most once, thereby reducing computation overhead. We also conduct experiments to show the efficiency of the protocol compared with existing methods. With the rapid development of cloud computing, users of cloud storage need to store and share data with each other in order to perform data mining. We design a new framework that lets users of cloud storage not only share their data with targeted parties but also revoke that access when required. The framework exploits the properties of proxy re-encryption schemes. Every user in the group has their own secret key to encrypt and decrypt data, and the key is revoked if the user leaves the group. Using proxy re-encryption, the framework lets any user access the data of other users in the same group.
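
The summary mentions multiplicatively homomorphic cryptosystems without naming one. As a rough illustration of that property only, the sketch below uses textbook ElGamal with toy parameters: multiplying two ciphertexts component-wise yields a ciphertext of the product of the plaintexts. The choice of ElGamal, the parameters P and G, and all function names are assumptions made for this example. It is not the CSSP protocol and it is not secure at this size.

```python
import random

# Toy parameters (assumed for illustration): a small prime modulus and base.
# Real deployments need large, carefully chosen groups.
P = 467
G = 2

def keygen(rng):
    """Return (secret_key, public_key) for textbook ElGamal."""
    secret = rng.randrange(2, P - 1)
    return secret, pow(G, secret, P)

def encrypt(public, message, rng):
    """Encrypt an integer message in the range 1..P-1."""
    k = rng.randrange(2, P - 1)  # fresh randomness for every ciphertext
    return pow(G, k, P), (message * pow(public, k, P)) % P

def decrypt(secret, ciphertext):
    """Recover the message by dividing out the shared value c1^secret."""
    c1, c2 = ciphertext
    shared = pow(c1, secret, P)
    # Modular inverse via Fermat's little theorem, valid because P is prime.
    return (c2 * pow(shared, P - 2, P)) % P

def multiply(ct_a, ct_b):
    """Multiply ciphertexts component-wise; the plaintexts multiply mod P."""
    return (ct_a[0] * ct_b[0]) % P, (ct_a[1] * ct_b[1]) % P

if __name__ == "__main__":
    rng = random.Random(0)
    secret, public = keygen(rng)
    ct_product = multiply(encrypt(public, 7, rng), encrypt(public, 12, rng))
    print(decrypt(secret, ct_product))  # product of the plaintexts, 7 * 12 mod P
```

Because ciphertexts can be combined without decryption, a party can store an encrypted value and reuse it in later products. That is one plausible reading of how caching could help, though the thesis itself should be consulted for the actual design.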