Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy

Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published...

Bibliographic Details
Main Authors: Yuan, Ganzhao; Zhang, Zhenjie; Winslett, Marianne; Xiao, Xiaokui; Yang, Yin; Hao, Zhifeng
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2017
Subjects: Linear counting query; Differential privacy
Online Access: https://hdl.handle.net/10356/81387
http://hdl.handle.net/10220/43472
author Yuan, Ganzhao
Zhang, Zhenjie
Winslett, Marianne
Xiao, Xiaokui
Yang, Yin
Hao, Zhifeng
author2 School of Computer Science and Engineering
collection NTU
description Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results while satisfying the privacy guarantees. Previous work, notably Li et al. [2010], has suggested that, with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose a low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, δ)-) differential privacy definitions. We derive the utility guarantees of LRM and provide guidance on how to set the privacy parameters, given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.
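The abstract's central idea, answering a correlated query batch through an intermediate strategy rather than query by query, can be illustrated with a small numerical sketch. This is not the authors' optimization procedure: the decomposition W = B L below is picked by hand for a toy workload, and the names `l1_sensitivity`, `expected_squared_error`, and `answer_batch` are hypothetical. It assumes ε-differential privacy via Laplace noise and a sensitivity model in which one record changes a single cell count by 1.

```python
import numpy as np

def l1_sensitivity(L):
    """Largest column L1 norm of L (assumes one record changes one cell count by 1)."""
    return float(np.abs(L).sum(axis=0).max())

def expected_squared_error(B, L, epsilon):
    """Expected total squared error of B @ (L @ x + Laplace noise)."""
    scale = l1_sensitivity(L) / epsilon          # Laplace scale for epsilon-DP
    return 2.0 * scale**2 * float(np.sum(B**2))  # Var[Laplace(scale)] = 2 * scale**2

def answer_batch(B, L, x, epsilon, rng):
    """Answer the workload W = B @ L by perturbing the intermediate results L @ x."""
    scale = l1_sensitivity(L) / epsilon
    noisy = L @ x + rng.laplace(loc=0.0, scale=scale, size=L.shape[0])
    return B @ noisy

# Hypothetical workload: three copies of the same total-count query over four cells.
W = np.ones((3, 4))
epsilon = 1.0

# Noise added to each query result (B = I, L = W).
err_results = expected_squared_error(np.eye(3), W, epsilon)
# Noise added to each cell count (B = W, L = I).
err_data = expected_squared_error(W, np.eye(4), epsilon)
# Hand-picked rank-1 decomposition W = B @ L.
B_low, L_low = np.ones((3, 1)), np.ones((1, 4))
err_low_rank = expected_squared_error(B_low, L_low, epsilon)

x = np.array([10.0, 20.0, 30.0, 40.0])  # made-up cell counts
noisy_answers = answer_batch(B_low, L_low, x, epsilon, np.random.default_rng(0))
```

For this toy workload with ε = 1, the expected total squared errors are 54 (noise on results), 24 (noise on data), and 6 (rank-1 decomposition), which shows why a well-chosen decomposition can beat both naive approaches on a correlated batch.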
first_indexed 2024-10-01T06:24:38Z
format Journal Article
id ntu-10356/81387
institution Nanyang Technological University
language English
last_indexed 2024-10-01T06:24:38Z
publishDate 2017
record_format dspace
spelling ntu-10356/81387 2020-03-07T11:48:54Z
Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy
Yuan, Ganzhao; Zhang, Zhenjie; Winslett, Marianne; Xiao, Xiaokui; Yang, Yin; Hao, Zhifeng
School of Computer Science and Engineering
Linear counting query; Differential privacy
MOE (Min. of Education, S’pore)
Accepted version
2017-07-27T08:09:21Z 2019-12-06T14:29:48Z 2017-07-27T08:09:21Z 2019-12-06T14:29:48Z
2015
Journal Article
Yuan, G., Zhang, Z., Winslett, M., Xiao, X., Yang, Y., & Hao, Z. (2015). Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy. ACM Transactions on Database Systems, 40(2), 11-.
0362-5915
https://hdl.handle.net/10356/81387
http://hdl.handle.net/10220/43472
10.1145/2699501
en
ACM Transactions on Database Systems
© 2015 ACM. This is the author created version of a work that has been peer reviewed and accepted for publication by ACM Transactions on Database Systems, ACM. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1145/2699501].
45 p.
application/pdf
title Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy
topic Linear counting query
Differential privacy
url https://hdl.handle.net/10356/81387
http://hdl.handle.net/10220/43472