A layer-wise frequency scaling for a neural processing unit

Dynamic voltage frequency scaling (DVFS) has been widely adopted for runtime power management of various processing units. In the case of neural processing units (NPUs), power management of neural network applications is required to adjust the frequency and voltage at every layer to account for the power...

Bibliographic Details
Main Authors:	Jaehoon Chung, HyunMi Kim, Kyoungseon Shin, Chun-Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young-Ho Gong, Sung Woo Chung
Format:	Article
Language:	English
Published:	Electronics and Telecommunications Research Institute (ETRI) 2022-10-01
Series:	ETRI Journal
Subjects:	dynamic frequency scaling, neural processing unit, power model
Online Access:	https://doi.org/10.4218/etrij.2022-0094

_version_	1811223736820432896
author	Jaehoon Chung, HyunMi Kim, Kyoungseon Shin, Chun-Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young-Ho Gong, Sung Woo Chung
author_facet	Jaehoon Chung, HyunMi Kim, Kyoungseon Shin, Chun-Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young-Ho Gong, Sung Woo Chung
author_sort	Jaehoon Chung
collection	DOAJ
description	Dynamic voltage frequency scaling (DVFS) has been widely adopted for runtime power management of various processing units. In the case of neural processing units (NPUs), power management of neural network applications is required to adjust the frequency and voltage at every layer to account for the power behavior and performance of each layer. Unfortunately, DVFS is inappropriate for layer-wise runtime power management of NPUs because the latency of voltage scaling is long compared with the execution time of each layer. Because frequency scaling is fast enough to keep up with each layer, we propose a layer-wise dynamic frequency scaling (DFS) technique for an NPU. Our proposed DFS selects, for each layer, the highest frequency that stays under the power limit of the NPU. To determine the highest allowable frequency, we build a power model that predicts the power consumption of an NPU based on real measurements on the fabricated NPU. Our evaluation results show that our proposed DFS improves frames per second (FPS) by 33% and saves energy by 14% on average, compared with DVFS.
first_indexed	2024-04-12T08:37:25Z
format	Article
id	doaj.art-7019db6b66c447488c26803a9450cb76
institution	Directory Open Access Journal
issn	1225-6463
language	English
last_indexed	2024-04-12T08:37:25Z
publishDate	2022-10-01
publisher	Electronics and Telecommunications Research Institute (ETRI)
record_format	Article
series	ETRI Journal
spelling	doaj.art-7019db6b66c447488c26803a9450cb76 2022-12-22T03:39:59Z eng Electronics and Telecommunications Research Institute (ETRI) ETRI Journal 1225-6463 2022-10-01 44 5 849 858 10.4218/etrij.2022-0094 A layer-wise frequency scaling for a neural processing unit Jaehoon Chung, HyunMi Kim, Kyoungseon Shin, Chun-Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young-Ho Gong, Sung Woo Chung Dynamic voltage frequency scaling (DVFS) has been widely adopted for runtime power management of various processing units. In the case of neural processing units (NPUs), power management of neural network applications is required to adjust the frequency and voltage at every layer to account for the power behavior and performance of each layer. Unfortunately, DVFS is inappropriate for layer-wise runtime power management of NPUs because the latency of voltage scaling is long compared with the execution time of each layer. Because frequency scaling is fast enough to keep up with each layer, we propose a layer-wise dynamic frequency scaling (DFS) technique for an NPU. Our proposed DFS selects, for each layer, the highest frequency that stays under the power limit of the NPU. To determine the highest allowable frequency, we build a power model that predicts the power consumption of an NPU based on real measurements on the fabricated NPU. Our evaluation results show that our proposed DFS improves frames per second (FPS) by 33% and saves energy by 14% on average, compared with DVFS. https://doi.org/10.4218/etrij.2022-0094 dynamic frequency scaling, neural processing unit, power model
spellingShingle	Jaehoon Chung HyunMi Kim Kyoungseon Shin Chun-Gi Lyuh Yong Cheol Peter Cho Jinho Han Youngsu Kwon Young-Ho Gong Sung Woo Chung A layer-wise frequency scaling for a neural processing unit ETRI Journal dynamic frequency scaling neural processing unit power model
title	A layer-wise frequency scaling for a neural processing unit
title_full	A layer-wise frequency scaling for a neural processing unit
title_fullStr	A layer-wise frequency scaling for a neural processing unit
title_full_unstemmed	A layer-wise frequency scaling for a neural processing unit
title_short	A layer-wise frequency scaling for a neural processing unit
title_sort	layer wise frequency scaling for a neural processing unit
topic	dynamic frequency scaling, neural processing unit, power model
url	https://doi.org/10.4218/etrij.2022-0094
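
The abstract in this record describes choosing, for each layer, the highest frequency whose predicted power stays under the NPU power limit. The record does not give the power model or the available frequency steps, so the following is only a minimal sketch under stated assumptions: a hypothetical linear model (static power plus a per-layer coefficient times frequency), made-up frequency steps, and made-up layer coefficients. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    # Hypothetical per-layer dynamic-power coefficient in W per MHz.
    # The record does not state how the real power model is parameterized.
    activity: float


def predicted_power(layer: Layer, freq_mhz: int, static_w: float = 0.5) -> float:
    """Assumed linear power model: static power plus activity times frequency."""
    return static_w + layer.activity * freq_mhz


def pick_frequency(layer: Layer, freqs_mhz: list[int], power_limit_w: float) -> int:
    """Return the highest frequency whose predicted power is within the limit.

    Falls back to the lowest available frequency if no step fits the limit.
    """
    allowed = [f for f in freqs_mhz if predicted_power(layer, f) <= power_limit_w]
    return max(allowed) if allowed else min(freqs_mhz)


# Illustrative values only; none of these numbers come from the paper.
FREQS = [400, 600, 800, 1000]
layers = [Layer("conv1", 0.004), Layer("fc", 0.0015)]
plan = {layer.name: pick_frequency(layer, FREQS, power_limit_w=3.0) for layer in layers}
print(plan)
```

With these assumed numbers, the power-hungry layer is held at a lower frequency step while the lighter layer runs at the top step, which is the per-layer behavior the abstract describes.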
