A Low-Power Hardware Architecture for Real-Time CNN Computing

Convolutional neural networks (CNNs) are widely deployed on edge devices for tasks such as object detection, image recognition and acoustic recognition. However, the limited resources and strict power constraints of edge devices pose a great challenge to applying the computationally intensiv...


Bibliographic Details
Main Authors:	Xinyu Liu, Chenhong Cao, Shengyu Duan
Format:	Article
Language:	English
Published:	MDPI AG 2023-02-01
Series:	Sensors
Subjects:	CNN, hardware acceleration, edge computing, RTC
Online Access:	https://www.mdpi.com/1424-8220/23/4/2045

_version_	1797618295449321472
author	Xinyu Liu, Chenhong Cao, Shengyu Duan
author_facet	Xinyu Liu Chenhong Cao Shengyu Duan
author_sort	Xinyu Liu
collection	DOAJ
description	Convolutional neural networks (CNNs) are widely deployed on edge devices for tasks such as object detection, image recognition and acoustic recognition. However, the limited resources and strict power constraints of edge devices make it challenging to run computationally intensive CNN models. In addition, for edge applications with real-time requirements, such as real-time computing (RTC) systems, the computations must be completed within the required timing constraint, which makes it harder to trade off computational latency against power consumption. In this paper, we propose a low-power CNN accelerator for edge inference in RTC systems, where the computations are performed in a column-wise manner so that the currently available input data can be processed immediately. We observe that most computations of some CNN kernels in deep layers can be completed over multiple cycles without affecting the overall computational latency. Thus, we present a multi-cycle scheme for the column-wise convolutional operations to reduce hardware resources and power consumption. We present a hardware architecture for the multi-cycle scheme as a domain-specific CNN architecture, which is then implemented in a 65 nm technology. We show that the proposed approach achieves power reductions of up to 8.45%, 49.41% and 50.64% for LeNet, AlexNet and VGG16, respectively. The experimental results show that our approach tends to yield a larger power reduction for CNN models with greater depth, larger kernels and more channels.
first_indexed	2024-03-11T08:11:05Z
format	Article
id	doaj.art-a8f4ae2be03b40148d29c0c51af15bc5
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-11T08:11:05Z
publishDate	2023-02-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-a8f4ae2be03b40148d29c0c51af15bc5; 2023-11-16T23:09:35Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2023-02-01; vol. 23, no. 4, art. 2045; DOI 10.3390/s23042045; A Low-Power Hardware Architecture for Real-Time CNN Computing; Xinyu Liu, Chenhong Cao, Shengyu Duan (all: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China); https://www.mdpi.com/1424-8220/23/4/2045; CNN, hardware acceleration, edge computing, RTC
title	A Low-Power Hardware Architecture for Real-Time CNN Computing
topic	CNN, hardware acceleration, edge computing, RTC
url	https://www.mdpi.com/1424-8220/23/4/2045
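
The abstract in this record describes convolutions computed in a column-wise manner, so that work starts as soon as the needed input columns are available. The sketch below illustrates only that general idea in software, and it is not the authors' hardware design: the function names (conv2d_direct, conv2d_columnwise), the square kernel, the unit stride and the absence of padding are all assumptions made for illustration. The multi-cycle scheme and any power behaviour are not modelled.

```python
from collections import deque


def conv2d_direct(image, kernel):
    """Reference: valid 2-D cross-correlation over the whole image."""
    rows, cols = len(image), len(image[0])
    k = len(kernel)  # assumption: square k x k kernel
    out = [[0] * (cols - k + 1) for _ in range(rows - k + 1)]
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k)
            )
    return out


def conv2d_columnwise(column_stream, kernel):
    """Yield one output column as soon as k input columns have arrived."""
    k = len(kernel)
    window = deque(maxlen=k)  # holds the k most recent input columns
    for column in column_stream:
        window.append(column)
        if len(window) < k:
            continue  # not enough columns buffered yet
        rows = len(column)
        # window[j][r + i] is the input at row r + i, j-th buffered column
        yield [
            sum(window[j][r + i] * kernel[i][j]
                for i in range(k) for j in range(k))
            for r in range(rows - k + 1)
        ]
```

To try it, pass the image as a sequence of columns, for example list(zip(*image)), and transpose the yielded columns back with zip(*output_columns); the result should match conv2d_direct on the same image and kernel. Only k columns are buffered at any time, which is the property the column-wise description relies on.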


Similar Items