Design of Flexible Hardware Accelerators for Image Convolutions and Transposed Convolutions
Nowadays, computer vision relies heavily on convolutional neural networks (CNNs) to perform complex and accurate tasks. Among them, super-resolution CNNs represent a meaningful example, due to the presence of both convolutional (CONV) and transposed convolutional (TCONV) layers. While the former exp...
Main Authors: | Cristian Sestito, Fanny Spagnolo, Stefania Perri |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-10-01 |
Series: | Journal of Imaging |
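Subjects: | hardware accelerators; convolutional neural networks; transposed convolution; super resolution imaging; field programmable gate array (FPGA) |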
Online Access: | https://www.mdpi.com/2313-433X/7/10/210 |
_version_ | 1797514230455336960 |
---|---|
author | Cristian Sestito; Fanny Spagnolo; Stefania Perri |
author_sort | Cristian Sestito |
collection | DOAJ |
description | Nowadays, computer vision relies heavily on convolutional neural networks (CNNs) to perform complex and accurate tasks. Among them, super-resolution CNNs represent a meaningful example, due to the presence of both convolutional (CONV) and transposed convolutional (TCONV) layers. While the former exploit multiply-and-accumulate (MAC) operations to extract features of interest from incoming feature maps (fmaps), the latter perform MACs to tune the spatial resolution of the received fmaps properly. The ever-growing real-time and low-power requirements of modern computer vision applications represent a stimulus for the research community to investigate the deployment of CNNs on well-suited hardware platforms, such as field programmable gate arrays (FPGAs). FPGAs are widely recognized as valid candidates for trading off computational speed and power consumption, thanks to their flexibility and their capability to also deal with computationally intensive models. In order to reduce the number of operations to be performed, this paper presents a novel hardware-oriented algorithm able to efficiently accelerate both CONVs and TCONVs. The proposed strategy was validated by employing it within a reconfigurable hardware accelerator purposely designed to adapt itself to different operating modes set at run-time. When characterized using the Xilinx XC7K410T FPGA device, the proposed accelerator achieved a throughput of up to 2022.2 GOPS and, in comparison to state-of-the-art competitors, it reached an energy efficiency up to 2.3 times higher, without compromising the overall accuracy. |
first_indexed | 2024-03-10T06:28:41Z |
format | Article |
id | doaj.art-35dd0e9e5d8942899c751db5a7ff4e6e |
institution | Directory Open Access Journal |
issn | 2313-433X |
language | English |
last_indexed | 2024-03-10T06:28:41Z |
publishDate | 2021-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Journal of Imaging |
spelling | doaj.art-35dd0e9e5d8942899c751db5a7ff4e6e; 2023-11-22T18:44:21Z; eng; MDPI AG; Journal of Imaging; 2313-433X; 2021-10-01; vol. 7, issue 10, article 210; 10.3390/jimaging7100210; Design of Flexible Hardware Accelerators for Image Convolutions and Transposed Convolutions; Cristian Sestito (0), Fanny Spagnolo (1), Stefania Perri (2); (0) Department of Informatics, Modeling, Electronics and System Engineering, University of Calabria, 87036 Rende, Italy; (1) Department of Informatics, Modeling, Electronics and System Engineering, University of Calabria, 87036 Rende, Italy; (2) Department of Mechanical, Energy and Management Engineering, University of Calabria, 87036 Rende, Italy; https://www.mdpi.com/2313-433X/7/10/210; hardware accelerators; convolutional neural networks; transposed convolution; super resolution imaging; field programmable gate array (FPGA) |
title | Design of Flexible Hardware Accelerators for Image Convolutions and Transposed Convolutions |
topic | hardware accelerators; convolutional neural networks; transposed convolution; super resolution imaging; field programmable gate array (FPGA) |
url | https://www.mdpi.com/2313-433X/7/10/210 |
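
As a rough illustration of the two layer types named in the description, the Python sketch below writes a stride-1 CONV and a strided TCONV as explicit MAC loops. It follows the textbook definitions only and is not the hardware-oriented algorithm proposed in the paper. The function names `conv2d` and `tconv2d`, the square-kernel and no-padding assumptions, and the default stride of 2 are all choices made for this example.

```python
def conv2d(fmap, kernel):
    """Stride-1, no-padding convolution written as explicit MACs.

    Assumes a rectangular fmap (list of rows) and a square k x k kernel.
    As in most CNN frameworks, the kernel is not flipped (cross-correlation).
    """
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    out = [[0] * (w - k + 1) for _ in range(h - k + 1)]
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            acc = 0
            for u in range(k):
                for v in range(k):
                    acc += fmap[i + u][j + v] * kernel[u][v]  # one MAC
            out[i][j] = acc
    return out


def tconv2d(fmap, kernel, stride=2):
    """Transposed convolution written as scatter-accumulate MACs.

    Each input pixel adds a copy of the kernel, scaled by the pixel value,
    into the output at an offset of (i * stride, j * stride). Overlapping
    copies are summed, which is how the output resolution grows.
    """
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for u in range(k):
                for v in range(k):
                    out[i * stride + u][j * stride + v] += fmap[i][j] * kernel[u][v]  # one MAC
    return out
```

With a 2x2 input, a 3x3 kernel, and stride 2, `tconv2d` returns a 5x5 map, since each output side is (n - 1) * stride + k. The same loops show why both layer types reduce to MACs, which is the property the description says the accelerator exploits.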