Summary: | This thesis studies random walks and its algorithmic applications in distributed networks. Random walk is a fundamental technique which has found numerous applications in computer science, mathematics, statistics, physics, and among others. The simulation of random walks in a network is an important tool, with a lot of applications in algorithms and complexity theory. In particular, in communication networks, random walks have been used in various applications including token management, load balancing, network topology discovery and construction, search, and peer-to-peer membership management, local and lightweight algorithms for dynamic networks etc. While several such algorithms are ubiquitous, and use random walk sampling, the walks themselves have always been performed in a naive way --- simply passing the random walk token from one node to its neighbor. Thus, to perform a random walk of length $\ell$, the naive approach requires $\ell$ steps. Therefore, a natural question is whether a better algorithm is possible in the distributed model. In this thesis, we focus on two main questions: (1) How efficiently random walks can be performed in distributed networks? (2) How random walks can be used in designing efficient distributed algorithms for important distributed computing problems? Towards the first question, the thesis studies efficient distributed random walk sampling in networks, where we focus on both static and dynamic networks. For static networks, we present a round and message optimal algorithm which can be used to output several random walk samples in a continuous online fashion. This significantly improves upon both the naive technique that requires linear time and messages, and the sophisticated algorithm presented by Das Sarma et al. \cite{drw-jacm} which has the same sub-linear (quadratic) running time as our algorithm, but requires a large number of messages (depending on the number of edges in the network). Moreover, we perform a comprehensive experimental evaluation on numerous network topologies which proves the effectiveness and efficiency of our algorithm. For dynamic networks, we present a fast distributed algorithm for performing random walks. Our algorithm is the first-known algorithm that provably speeds up random walks in dynamic networks. Furthermore, we show a key application of this random walk algorithm for the fundamental problem of information spreading (a.k.a gossip) in dynamic networks. We use our random walk algorithm to obtain a fast distributed information spreading algorithm. Towards the second question, we study two important algorithmic applications: (1) Distributed computation of PageRank (2) Distributed computation of sparse cuts. PageRank has emerged as a powerful measure of relative importance of nodes in a network. In distributed computing, PageRank vectors have been used for several different applications ranging from determining important nodes, load balancing, search, and identifying connectivity structures. We devise random walk-based algorithms for computing PageRank and prove strong bounds on the round complexity. Our algorithms are the first and fully distributed algorithms for computing PageRank with provably efficient running time. Finding sparse cuts is an important tool in analyzing large-scale distributed networks such as the Internet and Peer-to-Peer networks, as well as large-scale graphs such as the web graph, online social communities, and VLSI circuits. Sparse cuts are particularly useful in graph clustering and partitioning. We develop fast distributed algorithms for computing sparse cuts in distributed networks, where random walks are used as a key ingredient.
|