Load Balancing in NetApp’s Clustered Storage Systems

Bibliographic Details
Main Author: Tran, Tho
Other Authors: McKenna, Michael
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/139065
Description
Summary: To address the problem of load balancing in NetApp’s storage system, this thesis aims to design and implement an algorithm that results in more evenly distributed cluster reconfigurations with minimal disturbance to clients’ workloads. I implement three different greedy algorithms to find a more balanced workload-node assignment that lowers the maximum number of operations across the cluster. To analyze the performance of the greedy algorithms, I compare their results with those of the evolutionary and brute-force algorithms. I also examine whether clusters’ characteristics affect the algorithms’ performance. The key finding is that the greedy algorithm with the advanced heuristic performs at least as well as the naive and intermediate greedy algorithms in five clusters that are representative of NetApp data. However, the tradeoff is that the advanced greedy algorithm takes more time to run and requires more migration moves, thus causing NetApp clients or support engineers the inconvenience of manually moving multiple workloads. On the other hand, the naive greedy algorithm performs well on large clusters that primarily have small, non-dominating workloads but is more likely to get stuck in local minima on small clusters that have one or more dominating workloads. The intermediate algorithm performs as well as the naive greedy algorithm in these clusters. Finally, the evolutionary algorithm is suitable for clusters with fewer nodes and workloads. Based on these findings, it is recommended that NetApp use the naive greedy algorithm to balance large clusters that mostly have small, non-dominating workloads. If clusters have one or more large, dominating workloads, then it is best to use the advanced greedy algorithm to do load balancing.
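
The summary does not spell out the heuristics of the three greedy algorithms, so the following is only an illustrative sketch of what a simple greedy rebalancer of this general kind might look like, not the thesis's method. It assumes each workload has a fixed operations count, that the goal is to lower the maximum per-node load, and that each step moves one workload from the busiest node to the least busy node. The names `node_loads`, `greedy_rebalance`, and `max_moves` are hypothetical.

```python
from typing import Dict, List, Tuple


def node_loads(assignment: Dict[str, str], ops: Dict[str, int],
               nodes: List[str]) -> Dict[str, float]:
    """Sum the operations of the workloads assigned to each node."""
    loads = {n: 0.0 for n in nodes}
    for workload, node in assignment.items():
        loads[node] += ops[workload]
    return loads


def greedy_rebalance(assignment: Dict[str, str], ops: Dict[str, int],
                     nodes: List[str], max_moves: int = 10
                     ) -> Tuple[Dict[str, str], List[Tuple[str, str, str]]]:
    """Illustrative greedy pass (an assumption, not the thesis's heuristic).

    Repeatedly move one workload from the busiest node to the least busy
    node, choosing the workload that most lowers the peak load of that
    pair. Stop when no single move helps or max_moves is reached.
    """
    assignment = dict(assignment)  # leave the caller's mapping untouched
    moves: List[Tuple[str, str, str]] = []
    for _ in range(max_moves):
        loads = node_loads(assignment, ops, nodes)
        hot = max(nodes, key=lambda n: loads[n])
        cold = min(nodes, key=lambda n: loads[n])
        best, best_peak = None, loads[hot]
        for workload, node in assignment.items():
            if node != hot:
                continue
            # Peak load of the (hot, cold) pair if this workload moved.
            peak = max(loads[hot] - ops[workload], loads[cold] + ops[workload])
            if peak < best_peak:
                best, best_peak = workload, peak
        if best is None:
            break  # no single move improves the peak: a local minimum
        assignment[best] = cold
        moves.append((best, hot, cold))
    return assignment, moves
```

In this sketch, `max_moves` caps the number of migrations, reflecting the tradeoff the summary describes between balance quality and the cost of moving workloads. A node holding a single dominating workload cannot be improved by any single move here, which is one way a simple greedy pass can stall in a local minimum.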