VF-CART: A communication-efficient vertical federated framework for the CART algorithm
With growing concerns about privacy and the fact that data are distributed among multiple parties in realistic scenarios, vertical federated learning (VFL) is becoming increasingly important. There is an increasing trend in adapting machine learning algorithms to the VFL setting. As a category of pr...
Main Authors: | Yang Xu, Xuexian Hu, Jianghong Wei, Hongjian Yang, Kejia Li |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2023-01-01 |
Series: | Journal of King Saud University: Computer and Information Sciences |
Subjects: | Vertical federated learning; CART decision tree; Privacy preservation; Homomorphic encryption |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1319157822004116 |
_version_ | 1797942126265237504 |
---|---|
author | Yang Xu; Xuexian Hu; Jianghong Wei; Hongjian Yang; Kejia Li |
author_sort | Yang Xu |
collection | DOAJ |
description | With growing concerns about privacy and the fact that data are distributed among multiple parties in realistic scenarios, vertical federated learning (VFL) is becoming increasingly important, and machine learning algorithms are increasingly being adapted to the VFL setting. As prevalent machine learning algorithms, decision trees and random forests in VFL have attracted widespread interest. However, existing frameworks suffer from either potential privacy breaches or high communication overhead. To close this gap, we propose a communication-efficient vertical federated framework for the classification and regression tree (CART) algorithm, called VF-CART, and extend it to random forests (RFs). Specifically, we convert feature values into bin values and build a histogram for each feature. A hash function and homomorphic encryption are used to choose the best split secretly, so that the participant holding the labels cannot obtain the sample subsets for each split. In addition, the number of ciphertexts transmitted between entities is reduced significantly in both the training and prediction stages. Participants that do not hold labels communicate only once with the third-party server during the tree-building stage, and during the prediction stage only one ciphertext must be transmitted to predict a sample. Finally, we conduct experiments on both real-world and synthetic datasets. The experimental results demonstrate that VF-CART significantly reduces the volume of communication. |
first_indexed | 2024-04-10T20:02:25Z |
format | Article |
id | doaj.art-4ca41cbc4e3b4fe398e9d488ba290a0f |
institution | Directory Open Access Journal |
issn | 1319-1578 |
language | English |
last_indexed | 2024-04-10T20:02:25Z |
publishDate | 2023-01-01 |
publisher | Elsevier |
record_format | Article |
series | Journal of King Saud University: Computer and Information Sciences |
spelling | doaj.art-4ca41cbc4e3b4fe398e9d488ba290a0f; 2023-01-27T04:18:44Z; eng; Elsevier; Journal of King Saud University: Computer and Information Sciences; 1319-1578; 2023-01-01; 35; 1; 237; 249; VF-CART: A communication-efficient vertical federated framework for the CART algorithm; Yang Xu (0), Xuexian Hu (1), Jianghong Wei (2), Hongjian Yang (3), Kejia Li (4); (0) State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; (1) State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; Corresponding author; (2) State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an, China; (3) State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; (4) State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; http://www.sciencedirect.com/science/article/pii/S1319157822004116; Vertical federated learning; CART decision tree; Privacy preservation; Homomorphic encryption |
title | VF-CART: A communication-efficient vertical federated framework for the CART algorithm |
topic | Vertical federated learning; CART decision tree; Privacy preservation; Homomorphic encryption |
url | http://www.sciencedirect.com/science/article/pii/S1319157822004116 |
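
The description in this record says that feature values are converted into bin values and that a histogram is built for each feature before the best split is chosen. The following is a minimal plaintext sketch of that general idea only. It is not the authors' protocol: the hash function, the homomorphic encryption, and the third-party server are all omitted, and the equal-width binning, the Gini impurity criterion, and every function name (`bin_edges`, `to_bins`, `gini`, `best_split`) are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter


def bin_edges(values, n_bins):
    """Interior cut points for equal-width binning (an assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def to_bins(values, edges):
    """Map each raw feature value to a bin index in [0, len(edges)]."""
    return [bisect_right(edges, v) for v in values]


def gini(counts):
    """Gini impurity of a label-count histogram."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(bins, labels, n_bins):
    """Return (bin_index, weighted_gini) of the best split 'bin <= bin_index'.

    Only the per-bin label histogram is scanned, not the raw feature values.
    """
    hist = [Counter() for _ in range(n_bins)]
    for b, y in zip(bins, labels):
        hist[b][y] += 1

    total = Counter(labels)
    n = len(labels)
    left = Counter()
    best = (None, float("inf"))
    for b in range(n_bins - 1):
        left += hist[b]          # accumulate label counts left of the cut
        right = total - left     # remaining label counts
        n_left = sum(left.values())
        n_right = n - n_left
        if n_left == 0 or n_right == 0:
            continue             # skip cuts that leave one side empty
        score = (n_left * gini(left) + n_right * gini(right)) / n
        if score < best[1]:
            best = (b, score)
    return best


# Hypothetical toy data: one feature column and binary labels.
feature = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]
edges = bin_edges(feature, n_bins=4)
binned = to_bins(feature, edges)
split = best_split(binned, labels, n_bins=4)
```

Working on bin indices means a split search only needs one label-count histogram per feature, so the amount of data exchanged depends on the number of bins rather than the number of distinct feature values. That is one plausible reading of why binning helps with communication cost, not a claim taken from the record.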