HACScale : hardware-aware compound scaling for resource-efficient DNNs
Model scaling is an effective way to improve the accuracy of deep neural networks (DNNs) by increasing the model capacity. However, existing approaches seldom consider the underlying hardware, causing inefficient utilization of hardware resources and consequently high inference latency. In this paper, we propose HACScale, a hardware-aware model scaling strategy to fully exploit hardware resources for higher accuracy. In HACScale, different dimensions of DNNs are jointly scaled with consideration of their contributions to hardware utilization and accuracy. To improve the efficiency of width scaling, we introduce importance-aware width scaling in HACScale, which computes the importance of each layer to accuracy and scales each layer accordingly to optimize the trade-off between accuracy and model parameters. Experiments show that HACScale improves hardware utilization by 1.92× on ImageNet; as a result, it achieves a 2.41% accuracy improvement with a negligible latency increase of 0.6%. On CIFAR-10, HACScale improves accuracy by 2.23% with only 6.5% latency growth.
Main Authors: | Kong, Hao; Liu, Di; Luo, Xiangzhong; Liu, Weichen; Subramaniam, Ravi |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Conference Paper |
Language: | English |
Published: | 2022 |
Subjects: | Engineering::Computer science and engineering; Deep Learning; Design Automation |
Online Access: | https://hdl.handle.net/10356/155808 |
_version_ | 1811688630653026304 |
author | Kong, Hao; Liu, Di; Luo, Xiangzhong; Liu, Weichen; Subramaniam, Ravi
author2 | School of Computer Science and Engineering |
author_facet | School of Computer Science and Engineering; Kong, Hao; Liu, Di; Luo, Xiangzhong; Liu, Weichen; Subramaniam, Ravi
author_sort | Kong, Hao |
collection | NTU |
description | Model scaling is an effective way to improve the accuracy of deep neural networks (DNNs) by increasing the model capacity. However, existing approaches seldom consider the underlying hardware, causing inefficient utilization of hardware resources and consequently high inference latency. In this paper, we propose HACScale, a hardware-aware model scaling strategy to fully exploit hardware resources for higher accuracy. In HACScale, different dimensions of DNNs are jointly scaled with consideration of their contributions to hardware utilization and accuracy. To improve the efficiency of width scaling, we introduce importance-aware width scaling in HACScale, which computes the importance of each layer to accuracy and scales each layer accordingly to optimize the trade-off between accuracy and model parameters. Experiments show that HACScale improves hardware utilization by 1.92× on ImageNet; as a result, it achieves a 2.41% accuracy improvement with a negligible latency increase of 0.6%. On CIFAR-10, HACScale improves accuracy by 2.23% with only 6.5% latency growth.
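The abstract describes importance-aware width scaling: estimate each layer's contribution to accuracy, then give more of the width (channel) budget to the more important layers. The paper's exact importance metric and scaling rule are not reproduced in this record, so the following is only a minimal illustrative sketch in PyTorch, using mean gradient magnitude as a stand-in importance score; `layer_importance`, `scale_widths`, and `base_widths` are hypothetical names introduced here for illustration.

```python
import torch
import torch.nn as nn

def layer_importance(model, loss_fn, data, target):
    # Proxy for each conv layer's importance to accuracy: the mean
    # absolute gradient of the loss w.r.t. its weights. (An assumption;
    # the paper's actual importance metric is not given in this record.)
    model.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    return {
        name: m.weight.grad.abs().mean().item()
        for name, m in model.named_modules()
        if isinstance(m, nn.Conv2d) and m.weight.grad is not None
    }

def scale_widths(base_widths, scores, width_factor):
    # Distribute the global width budget non-uniformly: layers with
    # higher normalized importance receive more extra channels. With
    # equal scores this reduces to uniform scaling by width_factor.
    total = sum(scores.values())
    n = len(scores)
    return {
        name: max(1, round(base_widths[name] *
                           (1.0 + (width_factor - 1.0) * n * s / total)))
        for name, s in scores.items()
    }

# Toy usage on a small model with random data:
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(),
                      nn.Linear(16 * 30 * 30, 10))
scores = layer_importance(model, nn.CrossEntropyLoss(),
                          torch.randn(4, 3, 32, 32),
                          torch.randint(0, 10, (4,)))
print(scale_widths({"0": 16}, scores, width_factor=1.5))
```

With a single layer the sketch simply returns the uniformly scaled width (24 here); with several conv layers, the per-layer factors spread above and below `width_factor` in proportion to the importance scores, which is the trade-off between accuracy and parameter count that the abstract refers to.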
first_indexed | 2024-10-01T05:35:16Z |
format | Conference Paper |
id | ntu-10356/155808 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T05:35:16Z |
publishDate | 2022 |
record_format | dspace |
spelling | Record: ntu-10356/155808 (last updated 2023-12-15T03:06:01Z)
Title: HACScale : hardware-aware compound scaling for resource-efficient DNNs
Authors: Kong, Hao; Liu, Di; Luo, Xiangzhong; Liu, Weichen; Subramaniam, Ravi
Affiliation: School of Computer Science and Engineering
Conference: 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)
Research Lab: HP-NTU Digital Manufacturing Corporate Lab
Subjects: Engineering::Computer science and engineering; Deep Learning; Design Automation
Abstract: Model scaling is an effective way to improve the accuracy of deep neural networks (DNNs) by increasing the model capacity. However, existing approaches seldom consider the underlying hardware, causing inefficient utilization of hardware resources and consequently high inference latency. In this paper, we propose HACScale, a hardware-aware model scaling strategy to fully exploit hardware resources for higher accuracy. In HACScale, different dimensions of DNNs are jointly scaled with consideration of their contributions to hardware utilization and accuracy. To improve the efficiency of width scaling, we introduce importance-aware width scaling in HACScale, which computes the importance of each layer to accuracy and scales each layer accordingly to optimize the trade-off between accuracy and model parameters. Experiments show that HACScale improves hardware utilization by 1.92× on ImageNet; as a result, it achieves a 2.41% accuracy improvement with a negligible latency increase of 0.6%. On CIFAR-10, HACScale improves accuracy by 2.23% with only 6.5% latency growth.
Funders: Nanyang Technological University; National Research Foundation (NRF)
Version: Submitted/Accepted version
Funding: This study is supported under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner, HP Inc., through the HP-NTU Digital Manufacturing Corporate Lab. This work is also partially supported by NTU NAP M4082282 and SUG M4082087, Singapore.
Deposited: 2022-03-22T02:32:35Z; Published: 2022
Type: Conference Paper
Citation: Kong, H., Liu, D., Luo, X., Liu, W. & Subramaniam, R. (2022). HACScale : hardware-aware compound scaling for resource-efficient DNNs. 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC), 708-713. https://dx.doi.org/10.1109/ASP-DAC52403.2022.9712593
ISBN: 9781665421355
Handle: https://hdl.handle.net/10356/155808
DOI: 10.1109/ASP-DAC52403.2022.9712593
Scopus ID: 2-s2.0-85122940747
Pages: 708-713
Language: en
Grant numbers: M4082282; M4082087
Dataset DOI: 10.21979/N9/KSXK4T
Rights: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/ASP-DAC52403.2022.9712593.
Format: application/pdf
spellingShingle | Engineering::Computer science and engineering; Deep Learning; Design Automation; Kong, Hao; Liu, Di; Luo, Xiangzhong; Liu, Weichen; Subramaniam, Ravi; HACScale : hardware-aware compound scaling for resource-efficient DNNs
title | HACScale : hardware-aware compound scaling for resource-efficient DNNs |
title_full | HACScale : hardware-aware compound scaling for resource-efficient DNNs |
title_fullStr | HACScale : hardware-aware compound scaling for resource-efficient DNNs |
title_full_unstemmed | HACScale : hardware-aware compound scaling for resource-efficient DNNs |
title_short | HACScale : hardware-aware compound scaling for resource-efficient DNNs |
title_sort | hacscale hardware aware compound scaling for resource efficient dnns |
topic | Engineering::Computer science and engineering; Deep Learning; Design Automation
url | https://hdl.handle.net/10356/155808 |
work_keys_str_mv | AT konghao hacscalehardwareawarecompoundscalingforresourceefficientdnns AT liudi hacscalehardwareawarecompoundscalingforresourceefficientdnns AT luoxiangzhong hacscalehardwareawarecompoundscalingforresourceefficientdnns AT liuweichen hacscalehardwareawarecompoundscalingforresourceefficientdnns AT subramaniamravi hacscalehardwareawarecompoundscalingforresourceefficientdnns |