Privately computing set-maximal matches in genomic data
Background: Finding long matches in deoxyribonucleic acid (DNA) sequences in large aligned genetic sequences is a problem of great interest. A paradigmatic application is the identification of distant relatives via large common subsequences in DNA data. However, because of the sensitive nature of ge...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
Springer Science and Business Media LLC
2020
|
Online Access: | https://hdl.handle.net/1721.1/127190 |
_version_ | 1811097932424806400 |
---|---|
author | Sotiraki, Katerina Ghosh, Esha Chen, Hao |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Sotiraki, Katerina Ghosh, Esha Chen, Hao |
author_sort | Sotiraki, Katerina |
collection | MIT |
description | Background: Finding long matches in deoxyribonucleic acid (DNA) sequences in large aligned genetic sequences is a problem of great interest. A paradigmatic application is the identification of distant relatives via large common subsequences in DNA data. However, because of the sensitive nature of genomic data such computations without security consideration might compromise the privacy of the individuals involved. Methods: The secret sharing technique enables the computation of matches while respecting the privacy of the inputs of the parties involved. This method requires interaction that depends on the circuit depth needed for the computation. Results: We design a new depth-optimized algorithm for computing set-maximal matches between a database of aligned genetic sequences and the DNA of an individual while respecting the privacy of both the database owner and the individual. We then implement and evaluate our protocol. Conclusions: Using modern cryptographic techniques, difficult genomic computations are performed in a privacy-preserving way. We enrich this research area by proposing a privacy-preserving protocol for set-maximal matches. |
first_indexed | 2024-09-23T17:07:17Z |
format | Article |
id | mit-1721.1/127190 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T17:07:17Z |
publishDate | 2020 |
publisher | Springer Science and Business Media LLC |
record_format | dspace |
spelling | mit-1721.1/1271902022-10-03T10:35:39Z Privately computing set-maximal matches in genomic data Sotiraki, Katerina Ghosh, Esha Chen, Hao Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Background: Finding long matches in deoxyribonucleic acid (DNA) sequences in large aligned genetic sequences is a problem of great interest. A paradigmatic application is the identification of distant relatives via large common subsequences in DNA data. However, because of the sensitive nature of genomic data such computations without security consideration might compromise the privacy of the individuals involved. Methods: The secret sharing technique enables the computation of matches while respecting the privacy of the inputs of the parties involved. This method requires interaction that depends on the circuit depth needed for the computation. Results: We design a new depth-optimized algorithm for computing set-maximal matches between a database of aligned genetic sequences and the DNA of an individual while respecting the privacy of both the database owner and the individual. We then implement and evaluate our protocol. Conclusions: Using modern cryptographic techniques, difficult genomic computations are performed in a privacy-preserving way. We enrich this research area by proposing a privacy-preserving protocol for set-maximal matches. 2020-09-04T21:10:13Z 2020-09-04T21:10:13Z 2020-07 2020-07-26T03:50:10Z Article http://purl.org/eprint/type/JournalArticle 1755-8794 https://hdl.handle.net/1721.1/127190 Sotiraki, Katerina et al. "Privately computing set-maximal matches in genomic data." BMC Medical Genomics 13, 72 (July 2020): 72 © 2020 Springer Nature en http://dx.doi.org/10.1186/s12920-020-0718-x BMC Medical Genomics Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ The Author(s) application/pdf Springer Science and Business Media LLC BioMed Central |
spellingShingle | Sotiraki, Katerina Ghosh, Esha Chen, Hao Privately computing set-maximal matches in genomic data |
title | Privately computing set-maximal matches in genomic data |
title_full | Privately computing set-maximal matches in genomic data |
title_fullStr | Privately computing set-maximal matches in genomic data |
title_full_unstemmed | Privately computing set-maximal matches in genomic data |
title_short | Privately computing set-maximal matches in genomic data |
title_sort | privately computing set maximal matches in genomic data |
url | https://hdl.handle.net/1721.1/127190 |
work_keys_str_mv | AT sotirakikaterina privatelycomputingsetmaximalmatchesingenomicdata AT ghoshesha privatelycomputingsetmaximalmatchesingenomicdata AT chenhao privatelycomputingsetmaximalmatchesingenomicdata |