Privately computing set-maximal matches in genomic data

Background: Finding long matches in deoxyribonucleic acid (DNA) sequences in large aligned genetic sequences is a problem of great interest. A paradigmatic application is the identification of distant relatives via large common subsequences in DNA data. However, because of the sensitive nature of ge...

Full description

Bibliographic Details
Main Authors: Sotiraki, Katerina, Ghosh, Esha, Chen, Hao
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language:English
Published: Springer Science and Business Media LLC 2020
Online Access:https://hdl.handle.net/1721.1/127190
_version_ 1811097932424806400
author Sotiraki, Katerina
Ghosh, Esha
Chen, Hao
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Sotiraki, Katerina
Ghosh, Esha
Chen, Hao
author_sort Sotiraki, Katerina
collection MIT
description Background: Finding long matches in deoxyribonucleic acid (DNA) sequences in large aligned genetic sequences is a problem of great interest. A paradigmatic application is the identification of distant relatives via large common subsequences in DNA data. However, because of the sensitive nature of genomic data such computations without security consideration might compromise the privacy of the individuals involved. Methods: The secret sharing technique enables the computation of matches while respecting the privacy of the inputs of the parties involved. This method requires interaction that depends on the circuit depth needed for the computation. Results: We design a new depth-optimized algorithm for computing set-maximal matches between a database of aligned genetic sequences and the DNA of an individual while respecting the privacy of both the database owner and the individual. We then implement and evaluate our protocol. Conclusions: Using modern cryptographic techniques, difficult genomic computations are performed in a privacy-preserving way. We enrich this research area by proposing a privacy-preserving protocol for set-maximal matches.
first_indexed 2024-09-23T17:07:17Z
format Article
id mit-1721.1/127190
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T17:07:17Z
publishDate 2020
publisher Springer Science and Business Media LLC
record_format dspace
spelling mit-1721.1/1271902022-10-03T10:35:39Z Privately computing set-maximal matches in genomic data Sotiraki, Katerina Ghosh, Esha Chen, Hao Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Background: Finding long matches in deoxyribonucleic acid (DNA) sequences in large aligned genetic sequences is a problem of great interest. A paradigmatic application is the identification of distant relatives via large common subsequences in DNA data. However, because of the sensitive nature of genomic data such computations without security consideration might compromise the privacy of the individuals involved. Methods: The secret sharing technique enables the computation of matches while respecting the privacy of the inputs of the parties involved. This method requires interaction that depends on the circuit depth needed for the computation. Results: We design a new depth-optimized algorithm for computing set-maximal matches between a database of aligned genetic sequences and the DNA of an individual while respecting the privacy of both the database owner and the individual. We then implement and evaluate our protocol. Conclusions: Using modern cryptographic techniques, difficult genomic computations are performed in a privacy-preserving way. We enrich this research area by proposing a privacy-preserving protocol for set-maximal matches. 2020-09-04T21:10:13Z 2020-09-04T21:10:13Z 2020-07 2020-07-26T03:50:10Z Article http://purl.org/eprint/type/JournalArticle 1755-8794 https://hdl.handle.net/1721.1/127190 Sotiraki, Katerina et al. "Privately computing set-maximal matches in genomic data." BMC Medical Genomics 13, 72 (July 2020): 72 © 2020 Springer Nature en http://dx.doi.org/10.1186/s12920-020-0718-x BMC Medical Genomics Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ The Author(s) application/pdf Springer Science and Business Media LLC BioMed Central
spellingShingle Sotiraki, Katerina
Ghosh, Esha
Chen, Hao
Privately computing set-maximal matches in genomic data
title Privately computing set-maximal matches in genomic data
title_full Privately computing set-maximal matches in genomic data
title_fullStr Privately computing set-maximal matches in genomic data
title_full_unstemmed Privately computing set-maximal matches in genomic data
title_short Privately computing set-maximal matches in genomic data
title_sort privately computing set maximal matches in genomic data
url https://hdl.handle.net/1721.1/127190
work_keys_str_mv AT sotirakikaterina privatelycomputingsetmaximalmatchesingenomicdata
AT ghoshesha privatelycomputingsetmaximalmatchesingenomicdata
AT chenhao privatelycomputingsetmaximalmatchesingenomicdata