Understanding Privacy-Utility Tradeoffs in Differentially Private Online Active Learning

We consider privacy-preserving learning in the context of online learning. In settings where data instances arrive sequentially in streaming fashion, incremental training algorithms such as stochastic gradient descent (SGD) can be used to learn and update prediction models. When labels are costly to ac...

Bibliographic Details
Main Authors:	Daniel M Bittner, Alejandro E Brito, Mohsen Ghassemi, Shantanu Rane, Anand D Sarwate, Rebecca N Wright
Format:	Article
Language:	English
Published:	Labor Dynamics Institute 2020-06-01
Series:	The Journal of Privacy and Confidentiality
Subjects:	Differential Privacy, Active Learning, Anomaly Detection
Online Access:	https://journalprivacyconfidentiality.org/index.php/jpc/article/view/720

_version_	1811265816604180480
author	Daniel M Bittner, Alejandro E Brito, Mohsen Ghassemi, Shantanu Rane, Anand D Sarwate, Rebecca N Wright
author_sort	Daniel M Bittner
collection	DOAJ
description	We consider privacy-preserving learning in the context of online learning. In settings where data instances arrive sequentially in streaming fashion, incremental training algorithms such as stochastic gradient descent (SGD) can be used to learn and update prediction models. When labels are costly to acquire, active learning methods can be used to select samples to be labeled from a stream of unlabeled data. These labeled data samples are then used to update the machine learning models. Privacy-preserving online learning can be used to update predictors on data streams containing sensitive information. The differential privacy framework quantifies the privacy risk in such settings. This work proposes a differentially private online active learning algorithm using stochastic gradient descent (SGD) to retrain the classifiers. We propose two methods for selecting informative samples. We incorporated this into a general-purpose web application that allows a non-expert user to evaluate the privacy-aware classifier and visualize key privacy-utility tradeoffs. Our application supports linear support vector machines and logistic regression and enables an analyst to configure and visualize the effect of using differentially private online active learning versus a non-private counterpart. The application is useful for comparing the privacy/utility tradeoff of different algorithms, which can be useful to decision makers in choosing which algorithms and parameters to use. Additionally, we use the application to evaluate our SGD-based solution and to show that it generates predictions with a superior privacy-utility tradeoff than earlier methods.
first_indexed	2024-04-12T20:30:45Z
format	Article
id	doaj.art-8ed838b64a1340a6b69cb186bc6ce0db
institution	Directory Open Access Journal
issn	2575-8527
language	English
last_indexed	2024-04-12T20:30:45Z
publishDate	2020-06-01
publisher	Labor Dynamics Institute
record_format	Article
series	The Journal of Privacy and Confidentiality
spelling	doaj.art-8ed838b64a1340a6b69cb186bc6ce0db; 2022-12-22T03:17:44Z; eng; Labor Dynamics Institute; The Journal of Privacy and Confidentiality; ISSN 2575-8527; 2020-06-01; vol. 10, no. 2; DOI 10.29012/jpc.720; Understanding Privacy-Utility Tradeoffs in Differentially Private Online Active Learning; Daniel M Bittner (Rutgers University), Alejandro E Brito (Palo Alto Research Center), Mohsen Ghassemi (Rutgers University), Shantanu Rane (Palo Alto Research Center), Anand D Sarwate (Rutgers University), Rebecca N Wright (Rutgers University); https://journalprivacyconfidentiality.org/index.php/jpc/article/view/720; Differential Privacy, Active Learning, Anomaly Detection
title	Understanding Privacy-Utility Tradeoffs in Differentially Private Online Active Learning
title_sort	understanding privacy utility tradeoffs in differentially private online active learning
topic	Differential Privacy, Active Learning, Anomaly Detection
url	https://journalprivacyconfidentiality.org/index.php/jpc/article/view/720

Similar Items