VF-CART: A communication-efficient vertical federated framework for the CART algorithm

With growing concerns about privacy and the fact that data are distributed among multiple parties in realistic scenarios, vertical federated learning (VFL) is becoming increasingly important. There is an increasing trend in adapting machine learning algorithms to the VFL setting. As a category of pr...

Bibliographic Details
Main Authors: Yang Xu, Xuexian Hu, Jianghong Wei, Hongjian Yang, Kejia Li
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Journal of King Saud University: Computer and Information Sciences
Subjects: Vertical federated learning; CART decision tree; Privacy preservation; Homomorphic encryption
Online Access: http://www.sciencedirect.com/science/article/pii/S1319157822004116
author Yang Xu
Xuexian Hu
Jianghong Wei
Hongjian Yang
Kejia Li
collection DOAJ
description With growing concerns about privacy and the fact that data are distributed among multiple parties in realistic scenarios, vertical federated learning (VFL) is becoming increasingly important, and machine learning algorithms are increasingly being adapted to the VFL setting. As a prevalent category of machine learning algorithms, decision trees and random forests in VFL have attracted widespread interest. However, existing frameworks suffer from either potential privacy breaches or high communication costs. To close this gap, we propose VF-CART, a communication-efficient vertical federated framework for the classification and regression tree (CART) algorithm, and extend it to random forests (RFs). Specifically, we convert feature values into bin values and build a histogram for each feature. Because a hash function and homomorphic encryption are used to choose the best split secretly, the participant holding the labels cannot obtain the sample subsets for each split. In addition, the number of ciphertexts transmitted between entities is reduced significantly in both the training and prediction stages. Participants that do not hold the labels communicate only once with the third-party server during the tree-building stage. During the prediction stage, only one ciphertext must be transmitted to predict a sample. Finally, we conducted experiments using both real-world and synthetic datasets. The experimental results demonstrate that the VF-CART algorithm significantly reduces the volume of communication.
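The binning and histogram step mentioned in the description can be illustrated with a minimal plaintext sketch. It is not the VF-CART protocol: it omits the hash function, the homomorphic encryption, and the third-party server entirely, and it assumes equal-width bins and a Gini criterion, neither of which is stated in this record. The function names `to_bins`, `gini`, and `best_split` are hypothetical.

```python
from collections import Counter


def to_bins(values, n_bins):
    """Map each feature value to an equal-width bin index in [0, n_bins - 1].

    Equal-width binning is an assumption made for this sketch.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def gini(counts):
    """Gini impurity of a label-count histogram."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(values, labels, n_bins=4):
    """Return (bin_threshold, weighted_gini) for the best split 'bin <= threshold'.

    Only the per-bin label counts (the histogram) are scanned, so the cost of
    choosing a split depends on n_bins rather than on the number of samples.
    """
    bins = to_bins(values, n_bins)
    hist = [Counter() for _ in range(n_bins)]
    for b, y in zip(bins, labels):
        hist[b][y] += 1

    total = Counter(labels)
    n = len(labels)
    left = Counter()
    best = (None, float("inf"))
    for t in range(n_bins - 1):
        left += hist[t]          # accumulate label counts for bins 0..t
        right = total - left     # remaining label counts
        n_left = sum(left.values())
        if n_left == 0 or n_left == n:
            continue             # skip splits that leave one side empty
        score = (n_left * gini(left) + (n - n_left) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best
```

Calling `best_split(values, labels)` on one feature column returns the bin index to split at and the weighted impurity of that split; in a federated setting, the per-bin label counts are the quantities that would need protecting.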
first_indexed 2024-04-10T20:02:25Z
format Article
id doaj.art-4ca41cbc4e3b4fe398e9d488ba290a0f
institution Directory Open Access Journal
issn 1319-1578
language English
last_indexed 2024-04-10T20:02:25Z
publishDate 2023-01-01
publisher Elsevier
record_format Article
series Journal of King Saud University: Computer and Information Sciences
spelling doaj.art-4ca41cbc4e3b4fe398e9d488ba290a0f
Record date: 2023-01-27T04:18:44Z
Language: eng
Publisher: Elsevier
Journal: Journal of King Saud University: Computer and Information Sciences
ISSN: 1319-1578
Publication date: 2023-01-01
Volume 35, issue 1, pages 237-249
Title: VF-CART: A communication-efficient vertical federated framework for the CART algorithm
Authors and affiliations:
Yang Xu (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China)
Xuexian Hu (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; corresponding author)
Jianghong Wei (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China; State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an, China)
Hongjian Yang (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China)
Kejia Li (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China)
title VF-CART: A communication-efficient vertical federated framework for the CART algorithm
topic Vertical federated learning
CART decision tree
Privacy preservation
Homomorphic encryption
url http://www.sciencedirect.com/science/article/pii/S1319157822004116