Computational disclosure control : a primer on data privacy protection

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.

Bibliographic Details
Main Author: Sweeney, Latanya
Other Authors: Hal Abelson.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2005
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/8589

Physical Description: 217 leaves
Notes: Includes bibliographical references (leaves 213-216) and index.
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Summary: Today's globally networked society places great demand on the dissemination and sharing of person-specific data for many new and exciting uses. When these data are linked together, they provide an electronic shadow of a person or organization that is as identifying and personal as a fingerprint, even when the information contains no explicit identifiers such as name and phone number. Other distinctive data, such as birth date and ZIP code, often combine uniquely and can be linked to publicly available information to re-identify individuals. Producing anonymous data that remain specific enough to be useful is often very difficult, and practice today tends either to assume incorrectly that confidentiality is maintained when it is not, or to produce data that are practically useless.

The goal of the work presented in this book is to explore computational techniques for releasing useful information in such a way that the identity of any individual or entity contained in the data cannot be recognized while the data remain practically useful. I begin by demonstrating ways to learn information about entities from publicly available information. I then provide a formal framework for reasoning about disclosure control and the ability to infer the identities of entities contained within the data. I formally define and present null-map, k-map and wrong-map as models of protection. Each model provides protection by ensuring that released information maps to no, k or incorrect entities, respectively.

The book ends by examining four computational systems that attempt to maintain privacy while releasing electronic information. These systems are: (1) my Scrub System, which locates personally identifying information in letters between doctors and notes written by clinicians; (2) my Datafly II System, which generalizes and suppresses values in field-structured data sets; (3) Statistics Netherlands' μ-Argus System, which is becoming a European standard for producing public-use data; and (4) my k-Similar algorithm, which finds optimal solutions such that data are minimally distorted while still providing adequate protection. By introducing anonymity and quality metrics, I show that Datafly II can overprotect data, Scrub and μ-Argus can fail to provide adequate protection, but k-Similar finds optimal results.

by Latanya Sweeney. Ph.D.
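
The idea of generalizing and suppressing values so that each released record matches at least k entities can be illustrated with a small sketch. This is not the Datafly II or k-Similar algorithm from the thesis. It is a minimal example under stated assumptions: the field names, the sample records, and the `generalize` and `release` helpers are all hypothetical, and it only counts matches inside the released table, which is a simplification of the k-map model the thesis defines formally.

```python
from collections import Counter

def generalize(record):
    """Coarsen quasi-identifiers: keep the birth year and a 3-digit ZIP prefix.

    Hypothetical generalization rule chosen for illustration only.
    """
    return (record["birth_date"][:4], record["zip"][:3] + "**")

def release(records, k):
    """Generalize every record, then suppress any record whose generalized
    (birth year, ZIP prefix) combination is shared by fewer than k records."""
    keys = [generalize(r) for r in records]
    counts = Counter(keys)
    return [
        {"birth_year": year, "zip": zip_prefix, "diagnosis": r["diagnosis"]}
        for r, (year, zip_prefix) in zip(records, keys)
        if counts[(year, zip_prefix)] >= k
    ]

# Made-up sample data; no explicit identifiers, but birth date and ZIP
# code together could still single out a person.
records = [
    {"birth_date": "1964-09-21", "zip": "02139", "diagnosis": "asthma"},
    {"birth_date": "1964-02-03", "zip": "02138", "diagnosis": "flu"},
    {"birth_date": "1971-07-15", "zip": "02141", "diagnosis": "diabetes"},
]

released = release(records, k=2)
for row in released:
    print(row)
```

With k=2, the two 1964 records share one generalized combination and are released, while the 1971 record is unique and is suppressed. Raising k protects more strongly but discards more data, which is the tension between protection and usefulness that the summary describes.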