Summary: | Conventional implementations of federated learning require a centralized entity to conduct and
coordinate the training over a star communication architecture. However, this design is prone
to a single point of failure, e.g., when the central node is malicious. In this study, we explore
decentralized federated learning frameworks in which clients communicate with one another through a peer-to-peer mechanism rather than a server-client one. We study how communication topology and model partitioning affect throughput and convergence in decentralized federated learning. To make our study as practically applicable as possible, we include network link latencies in our performance metrics for a fair evaluation. We conclude that the ring communication topology achieves the highest throughput and the best convergence metrics. In large networks, the ring topology is almost 8 times as fast as centralized communication.
|