FPGAN: An FPGA Accelerator for Graph Attention Networks With Software and Hardware Co-Optimization


Bibliographic Details
Main Authors: Weian Yan, Weiqin Tong, Xiaoli Zhi
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Graph attention networks, model optimization, inference accelerating, field programmable gate array, heterogeneous computing, parallel computing
Online Access: https://ieeexplore.ieee.org/document/9195849/
author Weian Yan
Weiqin Tong
Xiaoli Zhi
collection DOAJ
description Graph Attention Networks (GATs) exhibit outstanding performance on multiple authoritative node classification benchmarks, both transductive and inductive. This work implements FPGAN, an FPGA-based accelerator for graph attention networks that achieves significant improvements in performance and energy efficiency without losing accuracy relative to a PyTorch baseline. FPGAN eliminates the dependence on digital signal processors (DSPs) and large amounts of on-chip memory, and can work well even on low-end FPGA devices. We design FPGAN with software and hardware co-optimization across the full stack, from algorithm to architecture. Specifically, we compress the model to reduce its size, quantize features to enable fixed-point computation, replace multiply-accumulate (MAC) units with shift-addition units (SAUs) to eliminate the dependence on DSPs, and design an efficient algorithm to approximate the SoftMax function. We also adjust the activation functions and fuse operations to further reduce the computational requirements. Moreover, all data is vectorized and aligned for scalable vector computation and efficient memory access. All of these optimizations are integrated into a universal hardware pipeline that supports various GAT structures. We evaluate our design on an Inspur F10A board with an Intel Arria 10 GX1150 FPGA and 16 GB of DDR3 memory. Experimental results show that FPGAN achieves a 7.34x speedup over an Nvidia Tesla V100 and 593x over a Xeon Gold 5115 CPU while maintaining accuracy, with 48x and 2400x gains in energy efficiency, respectively.
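The shift-addition units (SAUs) described in the abstract rest on a standard idea: if each weight is quantized to a signed power of two, every multiply in a dot product becomes a bit shift, which needs no DSP blocks. The sketch below illustrates that general technique in software; the function names and quantization details are ours, not the paper's, and the actual FPGAN hardware design differs.

```python
import math

def quantize_pow2(w: float):
    """Round a weight to the nearest signed power of two.

    Returns (sign, exponent) with w ~= sign * 2**exponent; zero maps
    to sign 0 so the term can be skipped entirely.
    """
    if w == 0.0:
        return (0, 0)
    sign = 1 if w > 0 else -1
    return (sign, round(math.log2(abs(w))))

def shift_add_dot(features, quant_weights):
    """Dot product over nonnegative fixed-point (integer) features
    using only shifts and adds: multiplying by 2**e is a shift by e."""
    acc = 0
    for x, (sign, exp) in zip(features, quant_weights):
        if sign == 0:
            continue  # zero weight contributes nothing
        shifted = x << exp if exp >= 0 else x >> -exp
        acc += shifted if sign > 0 else -shifted
    return acc

# Example: weights [0.5, -2.0, 1.0] quantize exactly to powers of two,
# so the shift-add result matches the floating-point dot product.
weights = [quantize_pow2(w) for w in [0.5, -2.0, 1.0]]
print(shift_add_dot([8, 3, 5], weights))  # -> 3  (8*0.5 - 3*2 + 5*1)
```

Note that the right shift truncates when a weight's exponent is negative and the feature is not an exact multiple; a hardware design would fix a rounding policy for this, and weights that are not exact powers of two incur a quantization error that the paper's co-optimization must keep within accuracy bounds.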
format Article
id doaj.art-4419b29948614f4795d247af3830d366
institution Directory Open Access Journal
issn 2169-3536
language English
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-4419b29948614f4795d247af3830d366
DOI: 10.1109/ACCESS.2020.3023946 (IEEE Access, vol. 8, pp. 171608-171620, published 2020-01-01; ISSN 2169-3536; IEEE Xplore article 9195849)
FPGAN: An FPGA Accelerator for Graph Attention Networks With Software and Hardware Co-Optimization
Weian Yan (https://orcid.org/0000-0001-7249-6883), Weiqin Tong (https://orcid.org/0000-0001-8300-6376), Xiaoli Zhi (https://orcid.org/0000-0002-0615-2051)
Affiliation: School of Computer Engineering and Science, Shanghai University, Shanghai, China
https://ieeexplore.ieee.org/document/9195849/
title FPGAN: An FPGA Accelerator for Graph Attention Networks With Software and Hardware Co-Optimization
topic Graph attention networks
model optimization
inference accelerating
field programmable gate array
heterogeneous computing
parallel computing
url https://ieeexplore.ieee.org/document/9195849/