A Hardware Accelerator for the Inference of a Convolutional Neural Network

Bibliographic Details
Main Authors: Edwin González, Walter D. Villamizar Luna, Carlos Augusto Fajardo Ariza
Format: Article
Language: English
Published: Editorial Neogranadina 2019-11-01
Series: Ciencia e Ingeniería Neogranadina
Subjects:
Online Access: https://revistas.unimilitar.edu.co/index.php/rcin/article/view/4194
Description
Summary: Convolutional Neural Networks (CNNs) are becoming increasingly popular in deep learning applications such as image classification, speech recognition, and medicine. However, CNN inference is computationally intensive and demands a large amount of memory. This work proposes a CNN inference hardware accelerator implemented in a co-processing scheme. The aim is to reduce hardware resource usage while achieving the best possible throughput. The design was implemented on the Digilent Arty Z7-20 development board, which is based on the Xilinx Zynq-7000 System on Chip (SoC). Our implementation achieved a […] accuracy on the MNIST database using only a 12-bit fixed-point format. The results show that the co-processing scheme, operating at a conservative clock speed of 100 MHz, can identify around 441 images per second, which is about 17% faster than a 650 MHz software implementation. It is difficult to compare our results against other Field-Programmable Gate Array (FPGA) implementations because the other implementations are not exactly like ours. However, some comparisons of logic resource usage and accuracy suggest that our work could be better than previous works.
ISSN: 0124-8170, 1909-7735
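
The summary mentions a 12-bit fixed-point format and a throughput of around 441 images per second at 100 MHz. The following is a minimal Python sketch of what such a number format and cycle budget could look like. It is not the authors' implementation: the split into 8 fractional bits, the saturation behaviour, and all function names are assumptions made here for illustration only.

```python
TOTAL_BITS = 12  # word length stated in the summary
FRAC_BITS = 8    # ASSUMPTION: the summary does not give the integer/fraction split


def saturate(value, total_bits=TOTAL_BITS):
    """Clamp an integer to the signed range of a total_bits-wide word."""
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, value))


def to_fixed(x, frac_bits=FRAC_BITS, total_bits=TOTAL_BITS):
    """Quantize a float to a signed fixed-point integer (round, then saturate)."""
    return saturate(round(x * (1 << frac_bits)), total_bits)


def from_fixed(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits=FRAC_BITS, total_bits=TOTAL_BITS):
    """Multiply two fixed-point values and rescale to the same format.

    The right shift truncates toward negative infinity; real hardware
    may round differently.
    """
    return saturate((a * b) >> frac_bits, total_bits)


def cycles_per_image(clock_hz, images_per_second):
    """Average number of clock cycles available per processed image."""
    return clock_hz / images_per_second


if __name__ == "__main__":
    half = to_fixed(0.5)
    print(half, from_fixed(fixed_mul(half, half)))
    # Figures taken from the summary: 100 MHz clock, about 441 images per second.
    print(round(cycles_per_image(100e6, 441)))
```

Under these assumptions, values outside roughly -8 to +8 saturate rather than wrap, and the quoted clock speed and throughput work out to a budget of about 227,000 clock cycles per image.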