Making caches work for graph analytics

© 2017 IEEE. Large-scale applications implemented in today's high-performance graph frameworks heavily underutilize modern hardware systems. While many graph frameworks have made substantial progress in optimizing these applications, we show that it is still possible to achieve up to 5× speedups over the fastest frameworks by greatly improving cache utilization. Previous systems have applied out-of-core processing techniques from the memory/disk boundary to the cache/DRAM boundary. However, we find that blindly applying such techniques is ineffective because the much smaller performance gap between cache and DRAM requires new designs for achieving scalable performance and low overhead. We present Cagra, a cache-optimized in-memory graph framework. Cagra uses a novel technique, CSR Segmenting, to break the vertices into segments that fit in last-level cache, and partitions the graph into subgraphs based on the segments. Random accesses in each subgraph are limited to one segment at a time, eliminating the much slower random accesses to DRAM. The intermediate updates from each subgraph are written into buffers sequentially and later merged using a low-overhead parallel cache-aware merge. Cagra achieves speedups of up to 5× for PageRank, Collaborative Filtering, Label Propagation and Betweenness Centrality over the best published results from state-of-the-art graph frameworks, including GraphMat, Ligra and GridGraph.
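
To make the abstract's description of CSR Segmenting concrete, here is a minimal C++ sketch. The names (Edge, Subgraph, segment_graph, pull_iteration) are illustrative, not Cagra's actual API, and two parts are deliberately simplified: the per-subgraph buffers are dense arrays rather than the paper's compact buffers, and the merge is a serial blocked accumulation standing in for the low-overhead parallel cache-aware merge. It shows the core idea: edges are partitioned by source-vertex segment, random reads during a pull-style PageRank update stay inside one cache-sized segment, and intermediate updates are buffered per subgraph and merged block by block.

#include <algorithm>   // std::fill, std::min
#include <cstdint>
#include <iostream>
#include <vector>

struct Edge { uint32_t src, dst; };

// One subgraph: the edges whose source vertices fall in [seg_begin, seg_end).
struct Subgraph {
    uint32_t seg_begin, seg_end;
    std::vector<Edge> edges;
    std::vector<double> partial;   // per-destination partial sums (dense for simplicity)
};

// CSR Segmenting, simplified: split source vertices into segments small enough
// that a segment's vertex data fits in the last-level cache, then partition the
// edge list into one subgraph per segment.
std::vector<Subgraph> segment_graph(const std::vector<Edge>& edges,
                                    uint32_t num_vertices,
                                    uint32_t segment_size) {
    uint32_t num_segments = (num_vertices + segment_size - 1) / segment_size;
    std::vector<Subgraph> subs(num_segments);
    for (uint32_t s = 0; s < num_segments; ++s) {
        subs[s].seg_begin = s * segment_size;
        subs[s].seg_end = std::min(num_vertices, (s + 1) * segment_size);
        subs[s].partial.assign(num_vertices, 0.0);
    }
    for (const Edge& e : edges)
        subs[e.src / segment_size].edges.push_back(e);
    return subs;
}

// One pull-style iteration (e.g. PageRank): phase 1 confines random reads of
// contrib[] to one cache-resident segment per subgraph and writes updates into
// that subgraph's buffer; phase 2 merges the buffers one output block at a time
// so each result block stays cache resident while it is accumulated.
std::vector<double> pull_iteration(std::vector<Subgraph>& subs,
                                   const std::vector<double>& contrib,
                                   uint32_t block_size) {
    for (Subgraph& sg : subs) {
        std::fill(sg.partial.begin(), sg.partial.end(), 0.0);
        for (const Edge& e : sg.edges)
            sg.partial[e.dst] += contrib[e.src];   // e.src lies inside sg's segment
    }
    std::vector<double> result(contrib.size(), 0.0);
    for (uint32_t b = 0; b < result.size(); b += block_size) {
        uint32_t b_end = std::min<uint32_t>((uint32_t)result.size(), b + block_size);
        for (const Subgraph& sg : subs)
            for (uint32_t v = b; v < b_end; ++v)
                result[v] += sg.partial[v];
    }
    return result;
}

int main() {
    // Tiny example graph; a segment size of 2 stands in for an LLC-sized segment.
    std::vector<Edge> edges = {{0, 1}, {1, 2}, {2, 0}, {2, 3}, {3, 1}};
    uint32_t n = 4;
    auto subs = segment_graph(edges, n, /*segment_size=*/2);
    std::vector<double> contrib(n, 1.0);   // e.g. rank/out_degree in PageRank
    auto sums = pull_iteration(subs, contrib, /*block_size=*/2);
    for (uint32_t v = 0; v < n; ++v)
        std::cout << "vertex " << v << " received " << sums[v] << "\n";
}

On a real graph the segment size would be chosen so that a segment's vertex data fits in the last-level cache, both the per-subgraph loop and the blocked merge would run in parallel, and the buffers would be kept compact so the merge touches only destinations that were actually updated.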

Bibliographic Details
Main Authors: Zhang, Yunming, Kiriansky, Vladimir, Mendis, Charith, Amarasinghe, Saman, Zaharia, Matei
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: IEEE 2021
Online Access: https://hdl.handle.net/1721.1/137371
Citation: Zhang, Yunming, Kiriansky, Vladimir, Mendis, Charith, Amarasinghe, Saman and Zaharia, Matei. 2017. "Making caches work for graph analytics."
Date Issued: 2017-12
Type: Conference Paper
DOI: 10.1109/bigdata.2017.8257937
License: Creative Commons Attribution-Noncommercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/4.0/)