GraphMMU: Memory Management Unit for Sparse Graph Accelerators

Memory management units that use low-level AXI descriptor chains to hold irregular graph-oriented access sequences can help improve DRAM memory throughput of graph algorithms by almost an order of magnitude. For the Xilinx Zed board, we explore and compare the memory throughputs achievable when usin...
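As a rough illustration of what "descriptor chains that hold irregular graph-oriented access sequences" could look like, the sketch below turns a list of irregular element indices (for example, the neighbor indices of one vertex) into a linked chain of transfer descriptors. Everything in it is an assumption made for illustration: the three-field `Descriptor` layout, the 64-byte descriptor spacing, the zero "next" pointer as chain terminator, and the coalescing of consecutive indices into one longer burst are not taken from the paper or from the actual AXI DMA register map.

```python
from dataclasses import dataclass
from typing import List

DESC_BYTES = 64  # assumed spacing between descriptors in local memory (hypothetical)


@dataclass
class Descriptor:
    next_desc: int  # address of the next descriptor; 0 ends the chain in this sketch
    buf_addr: int   # DRAM address of the data to transfer
    length: int     # transfer length in bytes


def build_chain(indices: List[int], data_base: int, elem_bytes: int,
                desc_base: int) -> List[Descriptor]:
    """Build a simplified descriptor chain for an irregular index sequence.

    Runs of consecutive indices are merged into a single descriptor; this
    coalescing is an illustrative choice, not a claim about the paper.
    """
    # Group the index sequence into [start_index, element_count] runs.
    runs: List[List[int]] = []
    for idx in indices:
        if runs and idx == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1
        else:
            runs.append([idx, 1])

    # Emit one descriptor per run, linking each to the next slot.
    chain: List[Descriptor] = []
    for i, (start, count) in enumerate(runs):
        is_last = i == len(runs) - 1
        next_addr = 0 if is_last else desc_base + (i + 1) * DESC_BYTES
        chain.append(Descriptor(next_desc=next_addr,
                                buf_addr=data_base + start * elem_bytes,
                                length=count * elem_bytes))
    return chain


if __name__ == "__main__":
    # Hypothetical addresses: data array at 0x10000000, descriptors at 0x40000000.
    for d in build_chain([7, 2, 3, 4, 9], 0x1000_0000, 4, 0x4000_0000):
        print(hex(d.next_desc), hex(d.buf_addr), d.length)
```

Once such a chain sits in memory, a DMA engine can walk it without the CPU issuing each transfer, which is the kind of reduced CPU involvement the abstract attributes to descriptor-driven operation.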

Bibliographic Details
Main Authors:	Han, Jianglei; Kapre, Nachiket; Bean, Andrew; Moorthy, Pradeep; Siddhartha
Other Authors:	School of Computer Engineering
Format:	Conference Paper
Language:	English
Published:	2015
Subjects:	Computer Science and Engineering
Online Access:	https://hdl.handle.net/10356/81201 http://hdl.handle.net/10220/39176

description	Memory management units that use low-level AXI descriptor chains to hold irregular graph-oriented access sequences can help improve DRAM memory throughput of graph algorithms by almost an order of magnitude. For the Xilinx Zed board, we explore and compare the memory throughputs achievable when using (1) cache-enabled CPUs with an OS, (2) cache-enabled CPUs running bare metal code, (3) CPU-based control of FPGA-based AXI DMAs, and finally (4) local FPGA-based control of AXI DMA transfers. For short-burst irregular traffic generated from sparse graph access patterns, we observe a performance penalty of almost 10× due to DRAM row activations when compared to cache-friendly sequential access. When using an AXI DMA engine configured in FPGA logic and programmed in AXI register mode from the CPU, we can improve DRAM performance by as much as 2.4× over naïve random access on the CPU. In this mode, we use the host CPU to trigger a DMA transfer by writing the appropriate control information to the internal register of the DMA engine. We also encode the sparse graph access patterns as locally-stored BRAM-hosted AXI descriptor chains to drive the AXI DMA engines with minimal CPU involvement under Scatter Gather mode. In this configuration, we deliver an additional 3× speedup, for a cumulative throughput improvement of 7× over a CPU-based approach using caches while running an OS to manage irregular access.
institution	Nanyang Technological University
conference	2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW)
version	Accepted version
citation	Kapre, N., Jianglei, H., Bean, A., Moorthy, P., & Siddhartha (2015). GraphMMU: Memory Management Unit for Sparse Graph Accelerators. 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 113-120.
doi	10.1109/IPDPSW.2015.101
rights	© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: http://dx.doi.org/10.1109/IPDPSW.2015.101.
extent	8 p.
file_format	application/pdf

Similar Items