Natural language processing on encrypted patient data
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
Main Author: | Grinman, Alex J |
---|---|
Other Authors: | Shafi Goldwasser |
Format: | Thesis |
Language: | eng |
Published: | Massachusetts Institute of Technology, 2018 |
Subjects: | Electrical Engineering and Computer Science. |
Online Access: | http://hdl.handle.net/1721.1/113438 |
_version_ | 1826211273010839552 |
---|---|
author | Grinman, Alex J |
author2 | Shafi Goldwasser. |
author_facet | Shafi Goldwasser. Grinman, Alex J |
author_sort | Grinman, Alex J |
collection | MIT |
description | Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. |
first_indexed | 2024-09-23T15:03:18Z |
format | Thesis |
id | mit-1721.1/113438 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T15:03:18Z |
publishDate | 2018 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/113438 2019-04-10T08:00:56Z Natural language processing on encrypted patient data Grinman, Alex J Shafi Goldwasser. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 85-86). While many industries can benefit from machine learning techniques for data analysis, they often have neither the technical expertise nor the computational power to do so. Therefore, many organizations would benefit from outsourcing their data analysis. Yet, stringent data privacy policies prevent outsourcing sensitive data and may stop the delegation of data analysis in its tracks. In this thesis, we put forth a two-party system where one party capable of powerful computation can run certain machine learning algorithms from the natural language processing domain on the second party's data, while the first party is limited to learning only specific functions of the second party's data and nothing else. Our system provides simple cryptographic schemes for locating keywords, matching approximate regular expressions, and computing frequency analysis on encrypted data. We present a full implementation of this system in the form of an extendible software library and a command line interface. Finally, we discuss a medical case study where we used our system to run a suite of unmodified machine learning algorithms on encrypted free text patient notes. by Alex J. Grinman. M. Eng. 2018-02-08T15:57:44Z 2018-02-08T15:57:44Z 2016 2016 Thesis http://hdl.handle.net/1721.1/113438 1020068666 eng MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582 86 pages application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Grinman, Alex J Natural language processing on encrypted patient data |
title | Natural language processing on encrypted patient data |
title_full | Natural language processing on encrypted patient data |
title_fullStr | Natural language processing on encrypted patient data |
title_full_unstemmed | Natural language processing on encrypted patient data |
title_short | Natural language processing on encrypted patient data |
title_sort | natural language processing on encrypted patient data |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/113438 |
work_keys_str_mv | AT grinmanalexj naturallanguageprocessingonencryptedpatientdata |