Design of Flexible Hardware Accelerators for Image Convolutions and Transposed Convolutions

Nowadays, computer vision relies heavily on convolutional neural networks (CNNs) to perform complex and accurate tasks. Among them, super-resolution CNNs represent a meaningful example, due to the presence of both convolutional (CONV) and transposed convolutional (TCONV) layers. While the former exp...

Bibliographic Details
Main Authors: Cristian Sestito, Fanny Spagnolo, Stefania Perri
Format: Article
Language: English
Published: MDPI AG 2021-10-01
Series: Journal of Imaging
Subjects:
Online Access: https://www.mdpi.com/2313-433X/7/10/210
author Cristian Sestito
Fanny Spagnolo
Stefania Perri
collection DOAJ
description Nowadays, computer vision relies heavily on convolutional neural networks (CNNs) to perform complex and accurate tasks. Among them, super-resolution CNNs represent a meaningful example, due to the presence of both convolutional (CONV) and transposed convolutional (TCONV) layers. While the former exploit multiply-and-accumulate (MAC) operations to extract features of interest from incoming feature maps (fmaps), the latter perform MACs to tune the spatial resolution of the received fmaps properly. The ever-growing real-time and low-power requirements of modern computer vision applications represent a stimulus for the research community to investigate the deployment of CNNs on well-suited hardware platforms, such as field programmable gate arrays (FPGAs). FPGAs are widely recognized as valid candidates for trading off computational speed and power consumption, thanks to their flexibility and their capability to also deal with computationally intensive models. In order to reduce the number of operations to be performed, this paper presents a novel hardware-oriented algorithm able to efficiently accelerate both CONVs and TCONVs. The proposed strategy was validated by employing it within a reconfigurable hardware accelerator purposely designed to adapt itself to different operating modes set at run-time. When characterized using the Xilinx XC7K410T FPGA device, the proposed accelerator achieved a throughput of up to 2022.2 GOPS and, in comparison to state-of-the-art competitors, it reached an energy efficiency up to 2.3 times higher, without compromising the overall accuracy.
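The distinction the description draws between CONV and TCONV layers can be illustrated with a minimal sketch. It uses the textbook definitions of the two operations, not the hardware-oriented algorithm proposed in the paper, which this record does not detail. The function names `conv2d` and `tconv2d`, the 4x4 input, the 3x3 kernel and the stride of 2 are all assumptions chosen for illustration.

```python
def conv2d(fmap, kernel):
    """Stride-1 CONV without padding (cross-correlation, as in CNNs).

    Each output pixel gathers a k x k window of the input, so the
    output is smaller than the input. Returns (output, MAC count).
    """
    h, w, k = len(fmap), len(fmap[0]), len(kernel)
    out = [[0] * (w - k + 1) for _ in range(h - k + 1)]
    macs = 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            for u in range(k):
                for v in range(k):
                    out[i][j] += fmap[i + u][j + v] * kernel[u][v]
                    macs += 1
    return out, macs


def tconv2d(fmap, kernel, stride):
    """TCONV without padding, written as scatter-and-accumulate.

    Each input pixel scales the whole kernel and adds it into the
    output at an offset of (i * stride, j * stride), so the output
    is larger than the input. Returns (output, MAC count).
    """
    h, w, k = len(fmap), len(fmap[0]), len(kernel)
    oh, ow = (h - 1) * stride + k, (w - 1) * stride + k
    out = [[0] * ow for _ in range(oh)]
    macs = 0
    for i in range(h):
        for j in range(w):
            for u in range(k):
                for v in range(k):
                    out[i * stride + u][j * stride + v] += fmap[i][j] * kernel[u][v]
                    macs += 1
    return out, macs


# Hypothetical 4x4 fmap and 3x3 kernel, for illustration only.
fmap = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

conv_out, conv_macs = conv2d(fmap, kernel)
tconv_out, tconv_macs = tconv2d(fmap, kernel, stride=2)

print(len(conv_out), len(conv_out[0]), conv_macs)     # 2 2 36
print(len(tconv_out), len(tconv_out[0]), tconv_macs)  # 9 9 144
```

In this sketch the CONV shrinks the 4x4 map to 2x2, while the TCONV with stride 2 enlarges it to 9x9. Both are built from the same MAC primitive, which is why a single accelerator can plausibly serve both layer types.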
first_indexed 2024-03-10T06:28:41Z
format Article
id doaj.art-35dd0e9e5d8942899c751db5a7ff4e6e
institution Directory Open Access Journal
issn 2313-433X
language English
last_indexed 2024-03-10T06:28:41Z
publishDate 2021-10-01
publisher MDPI AG
record_format Article
series Journal of Imaging
spelling doaj.art-35dd0e9e5d8942899c751db5a7ff4e6e; 2023-11-22T18:44:21Z; eng; MDPI AG; Journal of Imaging; 2313-433X; 2021-10-01; 7; 10; 210; 10.3390/jimaging7100210
Design of Flexible Hardware Accelerators for Image Convolutions and Transposed Convolutions
Cristian Sestito (0): Department of Informatics, Modeling, Electronics and System Engineering, University of Calabria, 87036 Rende, Italy
Fanny Spagnolo (1): Department of Informatics, Modeling, Electronics and System Engineering, University of Calabria, 87036 Rende, Italy
Stefania Perri (2): Department of Mechanical, Energy and Management Engineering, University of Calabria, 87036 Rende, Italy
title Design of Flexible Hardware Accelerators for Image Convolutions and Transposed Convolutions
topic hardware accelerators
convolutional neural networks
transposed convolution
super resolution imaging
field programmable gate array (FPGA)
url https://www.mdpi.com/2313-433X/7/10/210