Private k-means clustering : algorithms and applications
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Language: | eng |
Published: |
Massachusetts Institute of Technology
2017
|
Subjects: | |
Online Access: | http://hdl.handle.net/1721.1/106394 |
_version_ | 1811096511882199040 |
---|---|
author | Xiang, Chongyuan |
author2 | Daniela Rus. |
author_facet | Daniela Rus. Xiang, Chongyuan |
author_sort | Xiang, Chongyuan |
collection | MIT |
description | Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. |
first_indexed | 2024-09-23T16:44:51Z |
format | Thesis |
id | mit-1721.1/106394 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T16:44:51Z |
publishDate | 2017 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1063942019-04-12T17:07:30Z Private k-means clustering : algorithms and applications Xiang, Chongyuan Daniela Rus. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 77-80). Today is a new era of big data. We contribute our personal data for the common good simply by using our smart phones, searching the web and doing online transactions. Researchers, companies and governments use the collected data to learn various user behavior patterns and make impactful decisions based on that. Is it possible to publish and run queries on those databases without disclosing information about any specific individual? Differential privacy is a strong notion of privacy which guarantees that very little will be learned about individual records in the database, no matter what the attackers already know or wish to learn. Still, there is no practical system applying differential privacy algorithms for clustering points on real databases. This thesis describes the construction of small coresets for computing k-means clustering of a set of points while preserving differential privacy. As a result, it gives the first 𝑘-means clustering algorithm that is both differentially private, and has an approximation error that depends sub-linearly on the data’s dimension d. Previous results introduced errors that are exponential in d. This thesis implements this algorithm and uses it to create differentially private location data from GPS tracks. Specifically the algorithm allows clustering GPS databases generated from mobile nodes, while letting the user control the introduced noise due to privacy. This thesis also provides experimental results for the system and algorithms, and compares them to existing techniques. To the best of my knowledge, this is the first practical system that enables differentially private clustering on real data. by Chongyuan Xiang. M. Eng. 2017-01-12T18:19:00Z 2017-01-12T18:19:00Z 2016 2016 Thesis http://hdl.handle.net/1721.1/106394 967666900 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 80 pages application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Xiang, Chongyuan Private k-means clustering : algorithms and applications |
title | Private k-means clustering : algorithms and applications |
title_full | Private k-means clustering : algorithms and applications |
title_fullStr | Private k-means clustering : algorithms and applications |
title_full_unstemmed | Private k-means clustering : algorithms and applications |
title_short | Private k-means clustering : algorithms and applications |
title_sort | private k means clustering algorithms and applications |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/106394 |
work_keys_str_mv | AT xiangchongyuan privatekmeansclusteringalgorithmsandapplications |