Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data
The rapid development of cloud computing has made the cloud a common outsourced repository for big data, which generally contains a wealth of information. To mine the value in big data, machine learning is widely employed due to its ability to ad...
Main Authors: | Xiaoxia Dong, Jie Chen, Kai Zhang, Haifeng Qian |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
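Subjects: | Locally weighted linear regression; privacy-preserving; Paillier homomorphic encryption; stochastic gradient descent |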
Online Access: | https://ieeexplore.ieee.org/document/8943957/ |
_version_ | 1818910234613121024 |
---|---|
author | Xiaoxia Dong; Jie Chen; Kai Zhang; Haifeng Qian |
collection | DOAJ |
description | The rapid development of cloud computing has made the cloud a common outsourced repository for big data, which generally contains a wealth of information. To mine the value in big data, machine learning is widely employed because of its ability to adapt to changes in the data. However, the data mining process may expose users' private information, so users are reluctant to share their data. Outsourced data therefore needs to be handled securely. Encryption is considered the most straightforward way to protect data privacy, but machine learning on ciphertexts is more complicated than on plaintexts because the relationships between data items are no longer preserved; for this reason, we focus on machine learning over encrypted big data. In this work, we study locally weighted linear regression (LWLR), a classic machine learning algorithm widely used in real-world tasks such as prediction and finding the best-fit curve through numerous data points. To address the privacy concerns in using the LWLR algorithm, we present a system for privacy-preserving locally weighted linear regression that not only protects the privacy of users but also encrypts the best-fit curve. To this end, we use Paillier homomorphic encryption as the building block to encrypt data and then apply stochastic gradient descent in the encrypted domain. After giving a security analysis, we study how to let Paillier encryption handle real numbers and implement the system in Python, with several experiments on real-world data sets to evaluate its effectiveness. The results show that it outperforms the state of the art and incurs negligible error compared with performing locally weighted linear regression in the clear. |
first_indexed | 2024-12-19T22:39:34Z |
format | Article |
id | doaj.art-0ad7b60841a744a4b0dd8268282dba9b |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T22:39:34Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-0ad7b60841a744a4b0dd8268282dba9b; 2022-12-21T20:03:07Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; vol. 8, pp. 2247-2257; doi:10.1109/ACCESS.2019.2962700; 8943957; Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data; Xiaoxia Dong (https://orcid.org/0000-0002-6590-628X), Department of Computer Science and Technology, East China Normal University, Shanghai, China; Jie Chen (https://orcid.org/0000-0001-6757-6416), School of Software Engineering, East China Normal University, Shanghai, China; Kai Zhang (https://orcid.org/0000-0001-9728-4051), School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China; Haifeng Qian (https://orcid.org/0000-0003-4920-5405), School of Software Engineering, East China Normal University, Shanghai, China; https://ieeexplore.ieee.org/document/8943957/; Locally weighted linear regression; privacy-preserving; Paillier homomorphic encryption; stochastic gradient descent |
title | Privacy-Preserving Locally Weighted Linear Regression Over Encrypted Millions of Data |
topic | Locally weighted linear regression; privacy-preserving; Paillier homomorphic encryption; stochastic gradient descent |
url | https://ieeexplore.ieee.org/document/8943957/ |
work_keys_str_mv | AT xiaoxiadong privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata AT jiechen privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata AT kaizhang privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata AT haifengqian privacypreservinglocallyweightedlinearregressionoverencryptedmillionsofdata |
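
The description mentions using Paillier homomorphic encryption on real numbers. As a rough illustration of that general idea, and not the authors' implementation, the following sketch shows a toy Paillier scheme with a fixed-point encoding. The key primes, the `SCALE` constant, and all function names are assumptions chosen for this example; the primes are far too small for real security.

```python
import math
import secrets

# Toy key: two Mersenne primes, far too small for real security (assumption).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # modular inverse; valid because the generator is g = n + 1
SCALE = 10**6  # hypothetical fixed-point precision: six decimal places


def encode(x, scale=SCALE):
    """Map a real number to a residue mod N; negative values wrap around."""
    return round(x * scale) % N


def decode(m, scale=SCALE):
    """Invert encode: residues above N/2 are interpreted as negative."""
    if m > N // 2:
        m -= N
    return m / scale


def encrypt(m):
    """Paillier encryption with g = n + 1: c = (1 + m*n) * r^n mod n^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def add(c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds plaintexts."""
    return c1 * c2 % N2


def scalar_mul(c, k):
    """Homomorphic scaling: raising to k multiplies the plaintext by k."""
    return pow(c, k % N, N2)


def encrypted_weighted_sum(weights, ciphertexts):
    """Compute Enc(sum_i w_i * x_i) from plaintext weights and Enc(x_i)."""
    acc = encrypt(0)
    for w, c in zip(weights, ciphertexts):
        acc = add(acc, scalar_mul(c, encode(w)))
    return acc


xs = [1.5, -2.25]
ws = [0.5, 2.0]
enc_xs = [encrypt(encode(x)) for x in xs]
# Each product carries two scale factors, so decode with SCALE squared.
total = decode(decrypt(encrypted_weighted_sum(ws, enc_xs)), SCALE**2)
print(total)
```

The printed value should equal 0.5 * 1.5 + 2.0 * (-2.25) = -3.75. A weighted sum of this kind is the basic operation a gradient step needs; note that every multiplication by an encoded weight raises the scale of the result, which is one reason handling real numbers under Paillier requires care.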