Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data

With the rise of cloud computing, the cloud has become an outsourced repository for big data that generally contains a wealth of information. To mine the value in big data, machine learning is widely employed due to its ability to ad...


Bibliographic Details
Main Authors: Xiaoxia Dong, Jie Chen, Kai Zhang, Haifeng Qian
Format: Article
Language: English
Published: IEEE 2020-01-01
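Series: IEEE Access
Subjects: Locally weighted linear regression; privacy-preserving; Paillier homomorphic encryption; stochastic gradient descent
Online Access: https://ieeexplore.ieee.org/document/8943957/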
_version_ 1818910234613121024
author Xiaoxia Dong
Jie Chen
Kai Zhang
Haifeng Qian
author_facet Xiaoxia Dong
Jie Chen
Kai Zhang
Haifeng Qian
author_sort Xiaoxia Dong
collection DOAJ
description With the rise of cloud computing, the cloud has become an outsourced repository for big data that generally contains a wealth of information. To mine the value in big data, machine learning is widely employed due to its ability to adapt to data changes. However, the data mining process may raise privacy issues for users, who are therefore reluctant to share their information. Outsourced data must consequently be handled securely. Encryption is considered the most straightforward way to keep data private, but machine learning in the ciphertext domain is more complicated than on plaintext because the relational structure between data items is no longer preserved. We therefore focus on machine learning over encrypted big data. In this work, we study locally weighted linear regression (LWLR), a classic machine learning algorithm widely used in real-world tasks such as prediction and finding the best-fit curve through numerous data points. To address the privacy concerns in using LWLR, we present a system for privacy-preserving locally weighted linear regression that not only protects the privacy of users but also encrypts the best-fit curve. To this end, we use Paillier homomorphic encryption as the building block to encrypt the data and then apply stochastic gradient descent in the encrypted domain. After giving a security analysis, we study how to let Paillier encryption handle real numbers and implement the system in Python. Experiments on real-world data sets evaluate its effectiveness and show that it outperforms the state of the art and incurs negligible error compared with performing locally weighted linear regression in the clear.
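The description above relies on two ingredients: the additive homomorphism of Paillier encryption and some way of representing real numbers as Paillier plaintexts. This record does not say how the authors handle either, so the following is only a minimal sketch under stated assumptions: a textbook Paillier scheme with generator g = n + 1, two small hard-coded primes (illustrative, not secure), and a hypothetical fixed-point encoding with scale `SCALE`. All function names are made up for this example.

```python
import math
import random

# Toy parameters (assumption): two Mersenne primes, far too small for real use.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+); valid for g = n + 1
SCALE = 10**6  # hypothetical fixed-point precision, not taken from the paper


def encode(x, scale=SCALE):
    """Map a real number to a residue mod N; negatives wrap around."""
    return round(x * scale) % N


def decode(m, scale=SCALE):
    """Inverse of encode: residues above N // 2 are read as negative."""
    if m > N // 2:
        m -= N
    return m / scale


def encrypt(m):
    """Textbook Paillier encryption with g = n + 1 (random is not a CSPRNG)."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c):
    """Recover the plaintext residue in [0, N)."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def add(c1, c2):
    """Enc(a) * Enc(b) mod n^2 decrypts to a + b mod n."""
    return c1 * c2 % N2


def mul_const(c, k):
    """Enc(a)^k mod n^2 decrypts to a * k mod n, for a plaintext integer k."""
    return pow(c, k % N, N2)


if __name__ == "__main__":
    # Evaluate a + k * b on ciphertexts, the shape of a gradient-style update.
    ca, cb = encrypt(encode(1.5)), encrypt(encode(-2.25))
    # k * b carries scale SCALE**2, so a is rescaled by SCALE before adding.
    c = add(mul_const(ca, SCALE), mul_const(cb, encode(0.5)))
    print(decode(decrypt(c), SCALE**2))  # 0.375
```

Each multiplication by an encoded constant raises the fixed-point scale by one factor of `SCALE`, so operands must be brought to a common scale before they are added, and the plaintext modulus must be large enough to hold the scaled values without wrapping.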
first_indexed 2024-12-19T22:39:34Z
format Article
id doaj.art-0ad7b60841a744a4b0dd8268282dba9b
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T22:39:34Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-0ad7b60841a744a4b0dd8268282dba9b 2022-12-21T20:03:07Z eng IEEE IEEE Access 2169-3536 2020-01-01 8 2247 2257 10.1109/ACCESS.2019.2962700 8943957
Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
Xiaoxia Dong 0 https://orcid.org/0000-0002-6590-628X
Jie Chen 1 https://orcid.org/0000-0001-6757-6416
Kai Zhang 2 https://orcid.org/0000-0001-9728-4051
Haifeng Qian 3 https://orcid.org/0000-0003-4920-5405
Department of Computer Science and Technology, East China Normal University, Shanghai, China
School of Software Engineering, East China Normal University, Shanghai, China
School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China
School of Software Engineering, East China Normal University, Shanghai, China
https://ieeexplore.ieee.org/document/8943957/
Locally weighted linear regression
privacy-preserving
Paillier homomorphic encryption
stochastic gradient descent
spellingShingle Xiaoxia Dong
Jie Chen
Kai Zhang
Haifeng Qian
Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
IEEE Access
Locally weighted linear regression
privacy-preserving
Paillier homomorphic encryption
stochastic gradient descent
title Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
title_full Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
title_fullStr Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
title_full_unstemmed Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
title_short Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
title_sort privacy preserving locally weighted linear regression over encrypted millions of data
topic Locally weighted linear regression
privacy-preserving
Paillier homomorphic encryption
stochastic gradient descent
url https://ieeexplore.ieee.org/document/8943957/
work_keys_str_mv AT xiaoxiadong privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata
AT jiechen privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata
AT kaizhang privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata
AT haifengqian privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata