Design of FPGA Accelerator Architecture for Convolutional Neural Network
With the rapid development of artificial intelligence, convolutional neural networks (CNN) play an increasingly important role in many fields. In this paper, the existing convolutional neural network model is analyzed, and a convolutional neural network accelerator based on field-programmable gate a...
Main Authors: | LI Bingjian, QIN Guoxuan, ZHU Shaojie, PEI Zhihui |
---|---|
Format: | Article |
Language: | zho |
Published: | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2020-03-01 |
Series: | Jisuanji kexue yu tansuo |
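Subjects: | hardware accelerator; field-programmable gate array (FPGA); convolutional neural network (CNN); parameterized architecture; pipeline |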
Online Access: | http://fcst.ceaj.org/CN/abstract/abstract2137.shtml |
_version_ | 1818455404784386048 |
---|---|
author | LI Bingjian, QIN Guoxuan, ZHU Shaojie, PEI Zhihui |
author_facet | LI Bingjian, QIN Guoxuan, ZHU Shaojie, PEI Zhihui |
author_sort | LI Bingjian, QIN Guoxuan, ZHU Shaojie, PEI Zhihui |
collection | DOAJ |
description | With the rapid development of artificial intelligence, convolutional neural networks (CNNs) play an increasingly important role in many fields. This paper analyzes existing CNN models and designs a CNN accelerator based on a field-programmable gate array (FPGA). The convolution operation is parallelized in four dimensions. A parameterized architecture is proposed: under three parameter configurations, a single clock cycle completes 512, 1024, or 2048 multiply-accumulate operations, respectively. An on-chip double-buffer structure reduces off-chip memory accesses and enables effective data reuse. A pipeline implements the complete single-layer operation of the network, which improves computational efficiency. In comparisons with CPU, GPU, and related FPGA acceleration schemes, the experimental results show that the proposed design reaches 560.2 GOP/s, 8.9 times the speed of an i7-6850K CPU, and that its performance-to-power ratio is 3.0 times that of an NVIDIA GTX 1080Ti GPU. Compared with related work, the accelerator achieves a high performance-to-power ratio on mainstream CNNs without sacrificing generality. |
first_indexed | 2024-12-14T22:10:15Z |
format | Article |
id | doaj.art-be8d917519324aa5900fcf9dc4d673d1 |
institution | Directory Open Access Journal |
issn | 1673-9418 |
language | zho |
last_indexed | 2024-12-14T22:10:15Z |
publishDate | 2020-03-01 |
publisher | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press |
record_format | Article |
series | Jisuanji kexue yu tansuo |
spelling | doaj.art-be8d917519324aa5900fcf9dc4d673d1; 2022-12-21T22:45:46Z; zho; Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press; Jisuanji kexue yu tansuo; ISSN 1673-9418; 2020-03-01; vol. 14, no. 3, pp. 437-448; DOI 10.3778/j.issn.1673-9418.1906042; Design of FPGA Accelerator Architecture for Convolutional Neural Network; LI Bingjian, QIN Guoxuan, ZHU Shaojie, PEI Zhihui (School of Microelectronics, Tianjin University, Tianjin 300072, China); http://fcst.ceaj.org/CN/abstract/abstract2137.shtml; hardware accelerator; field-programmable gate array (FPGA); convolutional neural network (CNN); parameterized architecture; pipeline |
spellingShingle | LI Bingjian, QIN Guoxuan, ZHU Shaojie, PEI Zhihui; Design of FPGA Accelerator Architecture for Convolutional Neural Network; Jisuanji kexue yu tansuo; hardware accelerator; field-programmable gate array (FPGA); convolutional neural network (CNN); parameterized architecture; pipeline |
title | Design of FPGA Accelerator Architecture for Convolutional Neural Network |
title_full | Design of FPGA Accelerator Architecture for Convolutional Neural Network |
title_fullStr | Design of FPGA Accelerator Architecture for Convolutional Neural Network |
title_full_unstemmed | Design of FPGA Accelerator Architecture for Convolutional Neural Network |
title_short | Design of FPGA Accelerator Architecture for Convolutional Neural Network |
title_sort | design of fpga accelerator architecture for convolutional neural network |
topic | hardware accelerator; field-programmable gate array (FPGA); convolutional neural network (CNN); parameterized architecture; pipeline |
url | http://fcst.ceaj.org/CN/abstract/abstract2137.shtml |
work_keys_str_mv | AT libingjianqinguoxuanzhushaojiepeizhihui designoffpgaacceleratorarchitectureforconvolutionalneuralnetwork |
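
The figures quoted in the abstract can be related with some simple arithmetic. The sketch below is illustrative only: it assumes the common convention of counting one multiply-accumulate as two operations, and the clock frequency `clock_mhz` is a hypothetical value, since this record does not state the clock used in the paper. The function names are likewise made up for this example.

```python
# Back-of-the-envelope arithmetic for the numbers quoted in the abstract.
# Assumptions (not taken from the record): 1 MAC = 2 operations, and a
# hypothetical clock frequency chosen purely for illustration.

MACS_PER_CYCLE = (512, 1024, 2048)  # the three configurations in the abstract
REPORTED_GOPS = 560.2               # reported throughput of the design
CPU_SPEEDUP = 8.9                   # reported speedup over the i7-6850K


def peak_gops(macs_per_cycle: int, clock_mhz: float, ops_per_mac: int = 2) -> float:
    """Theoretical peak throughput in GOP/s at full MAC utilization."""
    return macs_per_cycle * ops_per_mac * clock_mhz * 1e6 / 1e9


def implied_baseline_gops(reported_gops: float, speedup: float) -> float:
    """Throughput of the baseline implied by a reported speedup factor."""
    return reported_gops / speedup


if __name__ == "__main__":
    clock_mhz = 200.0  # hypothetical clock; the record does not give one
    for macs in MACS_PER_CYCLE:
        print(f"{macs} MAC/cycle -> {peak_gops(macs, clock_mhz):.1f} GOP/s peak")
    cpu = implied_baseline_gops(REPORTED_GOPS, CPU_SPEEDUP)
    print(f"Implied CPU throughput: {cpu:.1f} GOP/s")
```

Doubling the number of MAC units doubles the theoretical peak, but sustained throughput also depends on utilization and memory bandwidth, which the on-chip double buffer described in the abstract is meant to address.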