A layer-wise frequency scaling for a neural processing unit

Bibliographic Details
Main Authors: Jaehoon Chung, HyunMi Kim, Kyoungseon Shin, Chun-Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young-Ho Gong, Sung Woo Chung
Format: Article
Language: English
Published: Electronics and Telecommunications Research Institute (ETRI) 2022-10-01
Series: ETRI Journal
Subjects: dynamic frequency scaling, neural processing unit, power model
Online Access: https://doi.org/10.4218/etrij.2022-0094
collection DOAJ
description Dynamic voltage and frequency scaling (DVFS) has been widely adopted for runtime power management of various processing units. For neural processing units (NPUs), power management of neural network applications must adjust the frequency and voltage for every layer to account for each layer's power behavior and performance. Unfortunately, DVFS is ill-suited to layer-wise runtime power management of NPUs because the latency of voltage scaling is long compared with the execution time of each layer. Because frequency scaling is fast enough to keep up with each layer, we propose a layer-wise dynamic frequency scaling (DFS) technique for an NPU. Our proposed DFS uses, for each layer, the highest frequency that stays under the power limit of the NPU. To determine the highest allowable frequency, we build a power model that predicts the power consumption of the NPU, based on real measurements taken on the fabricated NPU. Our evaluation results show that the proposed DFS improves frames per second (FPS) by 33% and saves energy by 14% on average, compared with DVFS.
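The selection rule described above (pick, for each layer, the highest frequency whose predicted power stays under the limit) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear power model, the class and function names, the frequency table, the coefficients, and the fallback to the lowest frequency are all assumptions made for illustration. The linear form assumes that, at a fixed voltage, dynamic power grows roughly in proportion to frequency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerPowerModel:
    """Hypothetical per-layer model: P(f) = static_w + dynamic_w_per_mhz * f."""
    name: str
    static_w: float           # assumed frequency-independent power, in watts
    dynamic_w_per_mhz: float  # assumed slope of power versus frequency

    def predict_power_w(self, freq_mhz: float) -> float:
        # Voltage is held fixed (DFS only), so only frequency varies here.
        return self.static_w + self.dynamic_w_per_mhz * freq_mhz


def highest_allowed_frequency(layer, freqs_mhz, power_limit_w):
    """Return the highest candidate frequency predicted to stay under the limit.

    Falls back to the lowest candidate if none qualifies (an assumption of
    this sketch, not something stated in the abstract).
    """
    allowed = [f for f in freqs_mhz if layer.predict_power_w(f) <= power_limit_w]
    return max(allowed) if allowed else min(freqs_mhz)


def plan_frequencies(layers, freqs_mhz, power_limit_w):
    """Build a per-layer frequency plan before running the network."""
    return {
        layer.name: highest_allowed_frequency(layer, freqs_mhz, power_limit_w)
        for layer in layers
    }


# Illustrative values only; none of these numbers come from the paper.
layers = [
    LayerPowerModel("conv1", static_w=1.0, dynamic_w_per_mhz=0.004),
    LayerPowerModel("fc", static_w=1.0, dynamic_w_per_mhz=0.002),
]
plan = plan_frequencies(layers, freqs_mhz=[400, 600, 800, 1000], power_limit_w=4.0)
print(plan)
```

In this made-up example the power-hungry layer is held to a lower frequency while the lighter layer runs at the top of the table, which is the behavior a layer-wise scheme is meant to allow.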
id doaj.art-7019db6b66c447488c26803a9450cb76
institution Directory Open Access Journal
issn 1225-6463
citation ETRI Journal, vol. 44, no. 5, pp. 849-858, 2022-10-01, https://doi.org/10.4218/etrij.2022-0094
topic dynamic frequency scaling
neural processing unit
power model