One Size Doesn't Fit All: Measuring Individual Privacy in Aggregate Genomic Data

Even in the aggregate, genomic data can reveal sensitive information about individuals. We present a new model-based measure, PrivMAF, that provides provable privacy guarantees for aggregate data (namely minor allele frequencies) obtained from genomic studies. Unlike many previous measures that have...

Bibliographic Details
Main Authors: Berger, Bonnie A., Simmons, Sean Kenneth
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers (IEEE) 2016
Online Access: http://hdl.handle.net/1721.1/105582
https://orcid.org/0000-0002-1537-4000
_version_ 1811089200185868288
author Berger, Bonnie A.
Simmons, Sean Kenneth
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Berger, Bonnie A.
Simmons, Sean Kenneth
author_sort Berger, Bonnie A.
collection MIT
description Even in the aggregate, genomic data can reveal sensitive information about individuals. We present a new model-based measure, PrivMAF, that provides provable privacy guarantees for aggregate data (namely minor allele frequencies) obtained from genomic studies. Unlike many previous measures that have been designed to measure the total privacy lost by all participants in a study, PrivMAF gives an individual privacy measure for each participant in the study, not just an average measure. These individual measures can then be combined to measure the worst-case privacy loss in the study. Our measure also allows us to quantify the privacy gains achieved by perturbing the data, either by adding noise or binning. Our findings demonstrate that both perturbation approaches offer significant privacy gains. Moreover, we see that these privacy gains can be achieved while minimizing perturbation (and thus maximizing the utility) relative to stricter notions of privacy, such as differential privacy. We test PrivMAF using genotype data from the Wellcome Trust Case Control Consortium, providing a more nuanced understanding of the privacy risks involved in an actual genome-wide association study. Interestingly, our analysis demonstrates that the privacy implications of releasing MAFs from a study can differ greatly from individual to individual. An implementation of our method is available at http://privmaf.csail.mit.edu.
first_indexed 2024-09-23T14:15:25Z
format Article
id mit-1721.1/105582
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T14:15:25Z
publishDate 2016
publisher Institute of Electrical and Electronics Engineers (IEEE)
record_format dspace
spelling mit-1721.1/1055822022-09-28T19:33:14Z One Size Doesn't Fit All: Measuring Individual Privacy in Aggregate Genomic Data Berger, Bonnie A. Simmons, Sean Kenneth Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Mathematics Berger, Bonnie Berger, Bonnie A. Simmons, Sean Kenneth
Wellcome Trust (London, England) (Award 076113) 2016-12-05T19:25:41Z 2016-12-05T19:25:41Z 2015-05 Article http://purl.org/eprint/type/ConferencePaper 978-1-4799-9933-0 http://hdl.handle.net/1721.1/105582 Simmons, Sean, and Bonnie Berger. “One Size Doesn’t Fit All: Measuring Individual Privacy in Aggregate Genomic Data.” IEEE, 2015. 41–49. https://orcid.org/0000-0002-1537-4000 en_US http://dx.doi.org/10.1109/SPW.2015.25 2015 IEEE Security and Privacy Workshops Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) Prof. Berger via Michael Noga
title One Size Doesn't Fit All: Measuring Individual Privacy in Aggregate Genomic Data
url http://hdl.handle.net/1721.1/105582
https://orcid.org/0000-0002-1537-4000
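The description above mentions releasing minor allele frequencies (MAFs) and perturbing them by adding noise or by binning. The sketch below only illustrates those three operations on a toy genotype table. It is not PrivMAF and not the authors' implementation; the function names, the 0/1/2 genotype coding, the choice of Laplace noise, and the bin width are all assumptions made for the example.

```python
import random


def minor_allele_frequencies(genotypes):
    """Compute per-SNP MAFs.

    genotypes: list of individuals, each a list with 0, 1, or 2 copies
    of the counted allele at every SNP (an assumed coding).
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    freqs = []
    for j in range(n_snps):
        allele_count = sum(person[j] for person in genotypes)
        f = allele_count / (2 * n)      # each individual carries 2 alleles
        freqs.append(min(f, 1 - f))     # the minor allele is the rarer one
    return freqs


def add_laplace_noise(freqs, scale, rng):
    """Perturb each MAF with Laplace(0, scale) noise, clamped to [0, 0.5].

    Laplace noise is an assumption here; the record only says "adding noise".
    """
    noisy = []
    for f in freqs:
        # The difference of two i.i.d. exponentials with mean `scale`
        # follows a Laplace distribution with that scale.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        noisy.append(min(0.5, max(0.0, f + noise)))
    return noisy


def bin_frequencies(freqs, width):
    """Coarsen each MAF to the nearest multiple of `width`."""
    return [round(f / width) * width for f in freqs]


# Toy data: 4 individuals, 3 SNPs (hypothetical values).
toy_genotypes = [
    [0, 1, 2],
    [1, 1, 2],
    [0, 0, 2],
    [0, 2, 2],
]
mafs = minor_allele_frequencies(toy_genotypes)
noisy_mafs = add_laplace_noise(mafs, scale=0.01, rng=random.Random(0))
binned_mafs = bin_frequencies(mafs, width=0.1)
```

Both perturbations trade accuracy of the released frequencies for privacy: a larger noise scale or a wider bin hides more about any one participant but distorts the MAFs more. Quantifying that trade-off per individual is what the PrivMAF measure is described as doing, and this sketch does not attempt it.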