BOFRF: A Novel Boosting-Based Federated Random Forest Algorithm on Horizontally Partitioned Data

Bibliographic Details
Main Authors: Mert Gencturk, A. Anil Sinaci, Nihan Kesim Cicekli
Format: Article
Language: English
Published: IEEE 2022-01-01
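Series: IEEE Access
Subjects: Ensemble learning; federated learning; machine learning; privacy-preservation; random forest classification
Online Access: https://ieeexplore.ieee.org/document/9867984/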
author Mert Gencturk
A. Anil Sinaci
Nihan Kesim Cicekli
author_sort Mert Gencturk
collection DOAJ
description The application of federated learning to ensemble methods is a common practice aimed at increasing the predictive power of local models. Existing federated solutions that use ensemble methods can achieve this when the sites' datasets are balanced and of good quality, i.e., when the local models are already above a certain accuracy threshold. However, they usually fail to provide the same level of improvement to the models of sites whose classifier is unsuccessful because of poor-quality or imbalanced data. To address this challenge, we propose a novel federated ensemble classification algorithm for horizontally partitioned data, namely Boosting-based Federated Random Forest (BOFRF), which not only increases the predictive power of all participating sites but also significantly improves the predictive power of sites that have unsuccessful local models. We implement a federated version of random forest, a well-known bagging algorithm, by adapting the idea of boosting to it. We introduce a novel aggregation and weight calculation methodology that assigns weights to local classifiers based on their classification performance at each site without increasing the communication or computation cost. We evaluate the performance of the proposed algorithm in different federated environments that we set up using four healthcare datasets. The empirical results show that BOFRF improves the predictive power of local random forest models in all cases. The advantage of BOFRF is that, unlike existing solutions, it provides a significant improvement for sites that have unsuccessful local models.
first_indexed 2024-04-11T09:35:52Z
format Article
id doaj.art-9238427447314c46900b2c47b2ec63b0
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-11T09:35:52Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-9238427447314c46900b2c47b2ec63b0 2022-12-22T04:31:41Z eng IEEE IEEE Access 2169-3536 2022-01-01 10 89835 89851 10.1109/ACCESS.2022.3202008 9867984
BOFRF: A Novel Boosting-Based Federated Random Forest Algorithm on Horizontally Partitioned Data
Mert Gencturk (0) https://orcid.org/0000-0003-2697-5722
A. Anil Sinaci (1) https://orcid.org/0000-0003-4397-3382
Nihan Kesim Cicekli (2)
(0) Computer Engineering Department, Middle East Technical University, Ankara, Turkey
(1) SRDC Software Research & Development and Consultancy Corporation, ODTU Teknokent, Ankara, Turkey
(2) Computer Engineering Department, Middle East Technical University, Ankara, Turkey
title BOFRF: A Novel Boosting-Based Federated Random Forest Algorithm on Horizontally Partitioned Data
topic Ensemble learning
federated learning
machine learning
privacy-preservation
random forest classification
url https://ieeexplore.ieee.org/document/9867984/