Energy-Efficient Acceleration of Deep Neural Networks on Realtime-Constrained Embedded Edge Devices

This paper presents a hardware management technique that enables energy-efficient acceleration of deep neural networks (DNNs) on realtime-constrained embedded edge devices. It is becoming increasingly common for edge devices to incorporate dedicated hardware accelerators for neural processing. The execu...


Bibliographic Details
Main Authors:	Bogil Kim, Sungjae Lee, Amit Ranjan Trivedi, William J. Song
Format:	Article
Language:	English
Published:	IEEE 2020-01-01
Series:	IEEE Access
Subjects:	Deep neural networks; heterogeneous computing; embedded processors; hardware management; hardware measurement; energy efficiency
Online Access:	https://ieeexplore.ieee.org/document/9262933/

author	Bogil Kim, Sungjae Lee, Amit Ranjan Trivedi, William J. Song
author_facet	Bogil Kim, Sungjae Lee, Amit Ranjan Trivedi, William J. Song
author_sort	Bogil Kim
collection	DOAJ
description	This paper presents a hardware management technique that enables energy-efficient acceleration of deep neural networks (DNNs) on realtime-constrained embedded edge devices. It is becoming increasingly common for edge devices to incorporate dedicated hardware accelerators for neural processing. The execution of neural accelerators generally follows a host-device model, where CPUs offload neural computations (e.g., matrix and vector calculations) to the accelerators for datapath-optimized execution. Such serialized execution is simple to implement and manage, but it is wasteful for resource-limited edge devices to exercise only a single type of processing unit in each discrete execution phase. This paper presents a hardware management technique named NeuroPipe that utilizes heterogeneous processing units in an embedded edge device to accelerate DNNs in an energy-efficient manner. In particular, NeuroPipe splits a neural network into groups of consecutive layers and pipelines their execution across different types of processing units. The proposed technique offers several advantages for accelerating DNN inference in the embedded edge device. It enables the embedded processor to operate at lower voltage and frequency to enhance energy efficiency while delivering the same performance as uncontrolled baseline executions, or, conversely, it can dispatch faster inferences at the same energy consumption. Our measurement-driven experiments based on an NVIDIA Jetson AGX Xavier with 64 tensor cores and an eight-core ARM CPU demonstrate that NeuroPipe reduces energy consumption by 11.4% on average without performance degradation, or it can achieve 30.5% greater performance for the same energy consumption.
first_indexed	2024-12-24T04:55:04Z
format	Article
id	doaj.art-90ea5134965d486f9679abdf6026d4b2
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-24T04:55:04Z
publishDate	2020-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-90ea5134965d486f9679abdf6026d4b2 | 2022-12-21T17:14:24Z | eng | IEEE | IEEE Access | 2169-3536 | 2020-01-01 | 8 | 216259 | 216270 | 10.1109/ACCESS.2020.3038908 | 9262933 | Energy-Efficient Acceleration of Deep Neural Networks on Realtime-Constrained Embedded Edge Devices | Bogil Kim [0] https://orcid.org/0000-0002-2332-7933 | Sungjae Lee [1] | Amit Ranjan Trivedi [2] | William J. Song [3] https://orcid.org/0000-0001-9170-5986 | [0] School of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea | [1] School of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea | [2] Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL, USA | [3] School of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea | https://ieeexplore.ieee.org/document/9262933/ | Deep neural networks; heterogeneous computing; embedded processors; hardware management; hardware measurement; energy efficiency
title	Energy-Efficient Acceleration of Deep Neural Networks on Realtime-Constrained Embedded Edge Devices
topic	Deep neural networks; heterogeneous computing; embedded processors; hardware management; hardware measurement; energy efficiency
url	https://ieeexplore.ieee.org/document/9262933/
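
The description above says that NeuroPipe splits a network into groups of consecutive layers and pipelines them across different types of processing units. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes two processing units, assumes per-layer latencies are already known, and picks the single split point that minimizes the slower pipeline stage. The function name `best_two_stage_split` and all latency values are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple


def best_two_stage_split(first_unit_ms: List[float],
                         second_unit_ms: List[float]) -> Tuple[int, float]:
    """Pick a split point for a two-stage pipeline (illustrative only).

    Layers [0, split) run on the first unit and layers [split, n) run on
    the second unit. In steady state, a two-stage pipeline completes one
    inference per bottleneck interval, i.e. the slower of the two stages.
    Returns (split index, bottleneck stage time in ms).
    """
    if len(first_unit_ms) != len(second_unit_ms):
        raise ValueError("both lists must describe the same layers")

    n = len(first_unit_ms)
    best_split, best_bottleneck = 0, float("inf")
    for split in range(n + 1):
        stage_a = sum(first_unit_ms[:split])   # time spent on the first unit
        stage_b = sum(second_unit_ms[split:])  # time spent on the second unit
        bottleneck = max(stage_a, stage_b)
        if bottleneck < best_bottleneck:
            best_split, best_bottleneck = split, bottleneck
    return best_split, best_bottleneck


# Hypothetical per-layer latencies in milliseconds (not measured data).
cpu_ms = [4.0, 6.0, 9.0, 12.0]
accel_ms = [1.0, 2.0, 3.0, 4.0]

split, bottleneck = best_two_stage_split(cpu_ms, accel_ms)
serialized_ms = sum(accel_ms)  # all layers on the accelerator, no pipelining
print(f"split after layer {split}: {bottleneck} ms per inference "
      f"vs {serialized_ms} ms serialized")
```

With these made-up numbers, running the first layer on the CPU and the rest on the accelerator shortens the steady-state interval between inferences compared with accelerator-only execution. Under the description's framing, that slack could either be kept as extra throughput or traded for lower voltage and frequency; the sketch does not model energy.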


Similar Items