Summary: | We study the trade-off between storage overhead and inter-cluster repair bandwidth in clustered storage systems when recovering from multiple node failures within a cluster. The system consists of n clusters, each a collection of m nodes. For data collection, we download the entire content of any k clusters. To repair t ≥ 2 failed nodes within a cluster, we draw on ℓ local helper nodes as well as d helper clusters. We characterize the optimal trade-off under functional repair, and also under exact repair for the minimum storage and minimum inter-cluster bandwidth (MBR) operating points. Our bounds show the following: 1) When t divides (m - ℓ), the trade-off is the same as that under t = 1, and thus there is no advantage in jointly repairing multiple nodes. 2) When t does not divide (m - ℓ), the optimal file size at the MBR point under exact repair can be strictly less than that under functional repair. 3) Unlike the case of t = 1, increasing the number of local helper nodes does not necessarily increase the system capacity under functional repair.
|