A layer-wise frequency scaling for a neural processing unit
Dynamic voltage and frequency scaling (DVFS) has been widely adopted for runtime power management of various processing units. For neural processing units (NPUs), power management of neural network applications must adjust the frequency and voltage at every layer to consider the power...
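Main Authors: | Jaehoon Chung, HyunMi Kim, Kyoungseon Shin, Chun-Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young-Ho Gong, Sung Woo Chung |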
---|---|
Format: | Article |
Language: | English |
Published: | Electronics and Telecommunications Research Institute (ETRI), 2022-10-01 |
Series: | ETRI Journal |
Subjects: | dynamic frequency scaling; neural processing unit; power model |
Online Access: | https://doi.org/10.4218/etrij.2022-0094 |
_version_ | 1811223736820432896 |
---|---|
author | Jaehoon Chung; HyunMi Kim; Kyoungseon Shin; Chun-Gi Lyuh; Yong Cheol Peter Cho; Jinho Han; Youngsu Kwon; Young-Ho Gong; Sung Woo Chung |
author_sort | Jaehoon Chung |
collection | DOAJ |
description | Dynamic voltage and frequency scaling (DVFS) has been widely adopted for runtime power management of various processing units. For neural processing units (NPUs), power management of neural network applications must adjust the frequency and voltage at every layer to account for the power behavior and performance of each layer. Unfortunately, DVFS is ill-suited to layer-wise runtime power management of NPUs because the latency of voltage scaling is long compared with the execution time of each layer. Because frequency scaling is fast enough to keep up with each layer, we propose a layer-wise dynamic frequency scaling (DFS) technique for an NPU. The proposed DFS runs each layer at the highest frequency that stays under the power limit of the NPU. To determine the highest allowable frequency, we build a power model that predicts the power consumption of the NPU, based on real measurements on the fabricated NPU. Our evaluation results show that the proposed DFS improves frames per second (FPS) by 33% and saves energy by 14% on average, compared with DVFS. |
first_indexed | 2024-04-12T08:37:25Z |
format | Article |
id | doaj.art-7019db6b66c447488c26803a9450cb76 |
institution | Directory Open Access Journal |
issn | 1225-6463 |
language | English |
last_indexed | 2024-04-12T08:37:25Z |
publishDate | 2022-10-01 |
publisher | Electronics and Telecommunications Research Institute (ETRI) |
record_format | Article |
series | ETRI Journal |
title | A layer-wise frequency scaling for a neural processing unit |
topic | dynamic frequency scaling; neural processing unit; power model |
url | https://doi.org/10.4218/etrij.2022-0094 |
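
The description field in this record says that the proposed DFS picks, for each layer, the highest frequency that keeps predicted power under the NPU's power limit. The Python sketch below illustrates that selection rule only. It is not the authors' implementation: the record does not give the paper's power model, so a toy linear model is assumed here (static power plus a per-layer coefficient times frequency, at fixed voltage). The names `Layer`, `predict_power_mw`, `pick_frequency`, and `schedule`, the fallback to the lowest frequency, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Layer:
    name: str
    dyn_mw_per_mhz: float  # hypothetical per-layer dynamic power coefficient


def predict_power_mw(layer: Layer, freq_mhz: float, static_mw: float) -> float:
    """Toy power model (an assumption, not the paper's model):
    static power plus dynamic power that grows linearly with frequency."""
    return static_mw + layer.dyn_mw_per_mhz * freq_mhz


def pick_frequency(layer: Layer, freqs_mhz: Sequence[float],
                   power_limit_mw: float, static_mw: float) -> float:
    """Return the highest candidate frequency whose predicted power is
    within the limit. If none fits, fall back to the lowest candidate
    (this fallback is an assumption of the sketch)."""
    allowed = [f for f in freqs_mhz
               if predict_power_mw(layer, f, static_mw) <= power_limit_mw]
    return max(allowed) if allowed else min(freqs_mhz)


def schedule(layers: Sequence[Layer], freqs_mhz: Sequence[float],
             power_limit_mw: float, static_mw: float) -> Dict[str, float]:
    """Choose one frequency per layer, independently of the other layers."""
    return {layer.name: pick_frequency(layer, freqs_mhz, power_limit_mw, static_mw)
            for layer in layers}


if __name__ == "__main__":
    # Hypothetical layers and operating points, for illustration only.
    layers = [Layer("conv1", 4.0), Layer("fc", 2.0)]
    plan = schedule(layers, [400.0, 600.0, 800.0, 1000.0],
                    power_limit_mw=3000.0, static_mw=500.0)
    for name, freq in plan.items():
        print(f"{name}: {freq:.0f} MHz")
```

In this sketch a power-hungry layer is held to a lower frequency while a lighter layer runs faster under the same power limit, which is the layer-wise behavior the description outlines.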