Private Random Variate Sampling for Secure and Federated Polygenic Risk Scores

Bibliographic Details
Main Author: Yen, Derek Jia-Wen
Other Authors: Berger, Bonnie
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/153906
Description
Summary: Polygenic risk scores (PRS) are used to quantify the additive effect of single nucleotide polymorphisms (SNPs) on an individual’s genetic risk for developing a particular trait or condition. Collaborations between data centers are important for improving the statistical power and validity of PRS through larger, more genetically diverse datasets. However, owing to the privacy concerns inherent in genomic data, regulations restrict institutions’ capacity to share data. Using cryptography, we present a secure and federated implementation of a Monte Carlo algorithm for PRS, enabling collaborations that respect data regulations. To implement a Monte Carlo algorithm in a privacy-preserving context, our work exhibits techniques for sampling random variates with cryptographically private parameters, which may be of independent interest.