Joint client-and-sample selection for federated learning via bi-level optimization

Federated Learning (FL) enables large numbers of local data owners to collaboratively train a deep learning model without disclosing their private data. The importance of local data samples from various data owners to FL models varies widely. This is exacerbated by the presence of noisy data, which exhibit large losses similar to important (hard) samples. Currently, no FL approach can effectively distinguish hard samples (which are beneficial) from noisy samples (which are harmful). To bridge this gap, we propose the joint Federated Meta-Weighting based Client and Sample Selection (FedMW-CSS) approach to simultaneously mitigate label noise and select hard samples. It is a bi-level optimization approach for FL client-and-sample selection and global model construction that achieves hard-sample-aware, noise-robust learning in a privacy-preserving manner. It performs meta-learning-based online approximation to iteratively update global FL models, select the most positively influential samples, and deal with training data noise. To utilize both instance-level and class-level information for better performance, FedMW-CSS efficiently learns a class-level weight by manipulating gradients at the class level, e.g., it performs a gradient descent step on class-level weights, which relies only on intermediate gradients. Theoretically, we analyze the privacy guarantees and convergence of FedMW-CSS. Extensive experimental comparisons against eight state-of-the-art baselines on six real-world datasets, in the presence of data noise and heterogeneity, show that FedMW-CSS achieves up to 28.5% higher test accuracy, while reducing communication and computation costs by at least 49.3% and 1.2%, respectively.
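
The class-level meta-weighting idea summarized above (a gradient descent step on class-level weights via a one-step bi-level approximation) can be sketched in a few lines. The following is a minimal, hypothetical single-client toy in PyTorch, not the authors' FedMW-CSS implementation: the linear model, the softmax-normalized weight logits w_logits, the random toy batches, the trusted meta batch, and the learning rates are all illustrative assumptions standing in for the paper's federated, privacy-preserving setup.

    # Hypothetical sketch: one-step bi-level update of class-level sample weights.
    # NOT the authors' FedMW-CSS code; a generic "learning to reweight" style toy.
    import torch
    import torch.nn.functional as F
    from torch.func import functional_call

    torch.manual_seed(0)
    n_classes, n_features = 3, 5
    model = torch.nn.Linear(n_features, n_classes)
    params = {k: v.detach().clone().requires_grad_(True)
              for k, v in model.named_parameters()}
    w_logits = torch.zeros(n_classes, requires_grad=True)  # class-level weight logits
    lr_model, lr_w = 0.1, 0.1

    # Toy data: a (possibly noisy) training batch and a small trusted meta batch.
    x_train, y_train = torch.randn(64, n_features), torch.randint(0, n_classes, (64,))
    x_meta, y_meta = torch.randn(16, n_features), torch.randint(0, n_classes, (16,))

    for step in range(100):
        # Inner step: virtual SGD update of the model under current class weights.
        logits = functional_call(model, params, (x_train,))
        per_sample = F.cross_entropy(logits, y_train, reduction="none")
        w = torch.softmax(w_logits, dim=0)             # normalized class-level weights
        train_loss = (w[y_train] * per_sample).mean()  # weight each sample by its class
        grads = torch.autograd.grad(train_loss, tuple(params.values()),
                                    create_graph=True)  # keep graph for the outer step
        virtual = {k: p - lr_model * g
                   for (k, p), g in zip(params.items(), grads)}

        # Outer step: evaluate the virtual model on the trusted batch and take a
        # gradient descent step on the class-level weights (the "meta" update).
        meta_loss = F.cross_entropy(functional_call(model, virtual, (x_meta,)), y_meta)
        (g_w,) = torch.autograd.grad(meta_loss, w_logits)
        with torch.no_grad():
            w_logits -= lr_w * g_w

        # Actual model update, reusing the inner gradients (detached).
        params = {k: (p - lr_model * g).detach().requires_grad_(True)
                  for (k, p), g in zip(params.items(), grads)}

In FedMW-CSS itself this interplay runs across clients and a server under privacy constraints; the sketch only illustrates the single-machine core of one class-level meta-weighting step.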

Bibliographic Details
Main Authors: Li, Anran; Wang, Guangjing; Hu, Ming; Sun, Jianfei; Zhang, Lan; Tuan, Luu Anh; Yu, Han
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Bi-level optimization; Federated learning
Online Access: https://hdl.handle.net/10356/181061
Published in: IEEE Transactions on Mobile Computing, 23(12), 15196-15209
ISSN: 1536-1233
DOI: 10.1109/TMC.2024.3455331
Scopus ID: 2-s2.0-85203416646
Institution: Nanyang Technological University
Citation: Li, A., Wang, G., Hu, M., Sun, J., Zhang, L., Tuan, L. A. & Yu, H. (2024). Joint client-and-sample selection for federated learning via bi-level optimization. IEEE Transactions on Mobile Computing, 23(12), 15196-15209. https://dx.doi.org/10.1109/TMC.2024.3455331
Funding: Agency for Science, Technology and Research (A*STAR); Nanyang Technological University; National Research Foundation (NRF). This research was supported in part by Nanyang Technological University (NTU) under Grant 020724-00001; in part by the RIE2025 Industry Alignment Fund, Industry Collaboration Projects (IAF-ICP), under Grant I2301E0026, administered by A*STAR, with support from Alibaba Group and NTU Singapore; in part by the National Research Foundation, Singapore and DSO National Laboratories under the AI Singapore Programme (AISG) under Grant AISG2-RP-2020-019; in part by the National Key R&D Program of China under Grant 2021YFB2900103; in part by the China National Natural Science Foundation under Grant 61932016; and in part by the Fundamental Research Funds for the Central Universities under Grant WK2150110024.
Rights: © 2024 IEEE. All rights reserved.