Private k-means clustering : algorithms and applications

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.

Bibliographic Details
Main Author: Xiang, Chongyuan
Other Authors: Daniela Rus.
Format: Thesis
Language:eng
Published: Massachusetts Institute of Technology 2017
Subjects:
Online Access:http://hdl.handle.net/1721.1/106394
_version_ 1811096511882199040
author Xiang, Chongyuan
author2 Daniela Rus.
author_facet Daniela Rus.
Xiang, Chongyuan
author_sort Xiang, Chongyuan
collection MIT
description Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
first_indexed 2024-09-23T16:44:51Z
format Thesis
id mit-1721.1/106394
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T16:44:51Z
publishDate 2017
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/1063942019-04-12T17:07:30Z Private k-means clustering : algorithms and applications Xiang, Chongyuan Daniela Rus. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 77-80). Today is a new era of big data. We contribute our personal data for the common good simply by using our smart phones, searching the web and doing online transactions. Researchers, companies and governments use the collected data to learn various user behavior patterns and make impactful decisions based on that. Is it possible to publish and run queries on those databases without disclosing information about any specific individual? Differential privacy is a strong notion of privacy which guarantees that very little will be learned about individual records in the database, no matter what the attackers already know or wish to learn. Still, there is no practical system applying differential privacy algorithms for clustering points on real databases. This thesis describes the construction of small coresets for computing k-means clustering of a set of points while preserving differential privacy. As a result, it gives the first 𝑘-means clustering algorithm that is both differentially private, and has an approximation error that depends sub-linearly on the data’s dimension d. Previous results introduced errors that are exponential in d. This thesis implements this algorithm and uses it to create differentially private location data from GPS tracks. Specifically the algorithm allows clustering GPS databases generated from mobile nodes, while letting the user control the introduced noise due to privacy. This thesis also provides experimental results for the system and algorithms, and compares them to existing techniques. To the best of my knowledge, this is the first practical system that enables differentially private clustering on real data. by Chongyuan Xiang. M. Eng. 2017-01-12T18:19:00Z 2017-01-12T18:19:00Z 2016 2016 Thesis http://hdl.handle.net/1721.1/106394 967666900 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 80 pages application/pdf Massachusetts Institute of Technology
spellingShingle Electrical Engineering and Computer Science.
Xiang, Chongyuan
Private k-means clustering : algorithms and applications
title Private k-means clustering : algorithms and applications
title_full Private k-means clustering : algorithms and applications
title_fullStr Private k-means clustering : algorithms and applications
title_full_unstemmed Private k-means clustering : algorithms and applications
title_short Private k-means clustering : algorithms and applications
title_sort private k means clustering algorithms and applications
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/106394
work_keys_str_mv AT xiangchongyuan privatekmeansclusteringalgorithmsandapplications