Executing dynamic data-graph computations deterministically using chromatic scheduling

A data-graph computation — popularized by such programming systems as Galois, Pregel, GraphLab, PowerGraph, and GraphChi — is an algorithm that performs local updates on the vertices of a graph. During each round of a data-graph computation, an update function atomically modifies the data associated...

Full description

Bibliographic Details
Main Authors: Leiserson, Charles E., Kaler, Timothy, Hasenplaugh, William Cleaburn, Schardl, Tao Benjamin
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language:en_US
Published: Association for Computing Machinery (ACM) 2016
Online Access:http://hdl.handle.net/1721.1/100928
https://orcid.org/0000-0001-6915-0216
https://orcid.org/0000-0002-3831-8255
https://orcid.org/0000-0003-0198-3283
_version_ 1826209142712303616
author Leiserson, Charles E.
Kaler, Timothy
Hasenplaugh, William Cleaburn
Schardl, Tao Benjamin
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Leiserson, Charles E.
Kaler, Timothy
Hasenplaugh, William Cleaburn
Schardl, Tao Benjamin
author_sort Leiserson, Charles E.
collection MIT
description A data-graph computation — popularized by such programming systems as Galois, Pregel, GraphLab, PowerGraph, and GraphChi — is an algorithm that performs local updates on the vertices of a graph. During each round of a data-graph computation, an update function atomically modifies the data associated with a vertex as a function of the vertex's prior data and that of adjacent vertices. A dynamic data-graph computation updates only an active subset of the vertices during a round, and those updates determine the set of active vertices for the next round. This paper introduces PRISM, a chromatic-scheduling algorithm for executing dynamic data-graph computations. PRISM uses a vertex-coloring of the graph to coordinate updates performed in a round, precluding the need for mutual-exclusion locks or other nondeterministic data synchronization. A multibag data structure is used by PRISM to maintain a dynamic set of active vertices as an unordered set partitioned by color. We analyze PRISM using work-span analysis. Let G=(V,E) be a degree-Δ graph colored with Χ colors, and suppose that Q⊆V is the set of active vertices in a round. Define size(Q)=[Q] + Σ[subscript v∈Q]deg(v), which is proportional to the space required to store the vertices of Q using a sparse-graph layout. We show that a P-processor execution of PRISM performs updates in Q using O(Χ(lg (Q/Χ)+lgΔ)+ lgP) span and Θ(size(Q)+Χ+P) work. These theoretical guarantees are matched by good empirical performance. We modified GraphLab to incorporate PRISM and studied seven application benchmarks on a 12-core multicore machine. PRISM executes the benchmarks 1.2–2.1 times faster than GraphLab's nondeterministic lock-based scheduler while providing deterministic behavior. This paper also presents PRISM-R, a variation of PRISM that executes dynamic data-graph computations deterministically even when updates modify global variables with associative operations. PRISM-R satisfies the same theoretical bounds as PRISM, but its implementation is more involved, incorporating a multivector data structure to maintain an ordered set of vertices partitioned by color.
first_indexed 2024-09-23T14:18:14Z
format Article
id mit-1721.1/100928
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T14:18:14Z
publishDate 2016
publisher Association for Computing Machinery (ACM)
record_format dspace
spelling mit-1721.1/1009282022-09-29T08:32:57Z Executing dynamic data-graph computations deterministically using chromatic scheduling Leiserson, Charles E. Kaler, Timothy Hasenplaugh, William Cleaburn Schardl, Tao Benjamin Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Kaler, Timothy Hasenplaugh, William Cleaburn Schardl, Tao Benjamin Leiserson, Charles E. A data-graph computation — popularized by such programming systems as Galois, Pregel, GraphLab, PowerGraph, and GraphChi — is an algorithm that performs local updates on the vertices of a graph. During each round of a data-graph computation, an update function atomically modifies the data associated with a vertex as a function of the vertex's prior data and that of adjacent vertices. A dynamic data-graph computation updates only an active subset of the vertices during a round, and those updates determine the set of active vertices for the next round. This paper introduces PRISM, a chromatic-scheduling algorithm for executing dynamic data-graph computations. PRISM uses a vertex-coloring of the graph to coordinate updates performed in a round, precluding the need for mutual-exclusion locks or other nondeterministic data synchronization. A multibag data structure is used by PRISM to maintain a dynamic set of active vertices as an unordered set partitioned by color. We analyze PRISM using work-span analysis. Let G=(V,E) be a degree-Δ graph colored with Χ colors, and suppose that Q⊆V is the set of active vertices in a round. Define size(Q)=[Q] + Σ[subscript v∈Q]deg(v), which is proportional to the space required to store the vertices of Q using a sparse-graph layout. We show that a P-processor execution of PRISM performs updates in Q using O(Χ(lg (Q/Χ)+lgΔ)+ lgP) span and Θ(size(Q)+Χ+P) work. These theoretical guarantees are matched by good empirical performance. We modified GraphLab to incorporate PRISM and studied seven application benchmarks on a 12-core multicore machine. PRISM executes the benchmarks 1.2–2.1 times faster than GraphLab's nondeterministic lock-based scheduler while providing deterministic behavior. This paper also presents PRISM-R, a variation of PRISM that executes dynamic data-graph computations deterministically even when updates modify global variables with associative operations. PRISM-R satisfies the same theoretical bounds as PRISM, but its implementation is more involved, incorporating a multivector data structure to maintain an ordered set of vertices partitioned by color. National Science Foundation (U.S.) (Grant CNS-1017058) National Science Foundation (U.S.) (Grant CCF-1162148) National Science Foundation (U.S.) (Grant CCF-1314547) Intel Corporation Foxconn International Holdings Ltd. National Science Foundation (U.S.). Graduate Research Fellowship 2016-01-19T19:25:18Z 2016-01-19T19:25:18Z 2014-06 Article http://purl.org/eprint/type/ConferencePaper 9781450328210 http://hdl.handle.net/1721.1/100928 Tim Kaler, William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson. 2014. Executing dynamic data-graph computations deterministically using chromatic scheduling. In Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures (SPAA '14). ACM, New York, NY, USA, 154-165. https://orcid.org/0000-0001-6915-0216 https://orcid.org/0000-0002-3831-8255 https://orcid.org/0000-0003-0198-3283 en_US http://dx.doi.org/10.1145/2612669.2612673 Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures (SPAA '14) Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery (ACM) MIT web domain
spellingShingle Leiserson, Charles E.
Kaler, Timothy
Hasenplaugh, William Cleaburn
Schardl, Tao Benjamin
Executing dynamic data-graph computations deterministically using chromatic scheduling
title Executing dynamic data-graph computations deterministically using chromatic scheduling
title_full Executing dynamic data-graph computations deterministically using chromatic scheduling
title_fullStr Executing dynamic data-graph computations deterministically using chromatic scheduling
title_full_unstemmed Executing dynamic data-graph computations deterministically using chromatic scheduling
title_short Executing dynamic data-graph computations deterministically using chromatic scheduling
title_sort executing dynamic data graph computations deterministically using chromatic scheduling
url http://hdl.handle.net/1721.1/100928
https://orcid.org/0000-0001-6915-0216
https://orcid.org/0000-0002-3831-8255
https://orcid.org/0000-0003-0198-3283
work_keys_str_mv AT leisersoncharlese executingdynamicdatagraphcomputationsdeterministicallyusingchromaticscheduling
AT kalertimothy executingdynamicdatagraphcomputationsdeterministicallyusingchromaticscheduling
AT hasenplaughwilliamcleaburn executingdynamicdatagraphcomputationsdeterministicallyusingchromaticscheduling
AT schardltaobenjamin executingdynamicdatagraphcomputationsdeterministicallyusingchromaticscheduling