Improving Model Capacity of Quantized Networks with Conditional Computation

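Network quantization is a crucial step when deploying deep models to edge devices: it is hardware-friendly and offers memory and computational advantages, but it also suffers performance degradation as a result of limited representation capability. We address this issue by introducing conditional computing to low-bit quantized networks. Instead of using a single fixed kernel for each layer, which usually does not generalize well across all input data, our proposed method uses multiple parallel kernels dynamically, in conjunction with a winner-takes-all gating mechanism that selects the best one to propagate information. Overall, our proposed method improves upon prior work without adding much computational overhead, resulting in better classification performance on the CIFAR-10 and CIFAR-100 datasets.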
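The abstract in this record describes several parallel low-bit kernels per layer, with a winner-takes-all gate choosing one of them for each input. The following is a minimal sketch of that idea, not the authors' implementation: the dense (non-convolutional) layer, the symmetric uniform quantizer, the linear gate, and the names `quantize_uniform` and `WTAQuantizedLayer` are all assumptions made for illustration.

```python
import numpy as np


def quantize_uniform(w, bits=2):
    """Symmetric uniform quantization to 2**bits levels (assumed scheme)."""
    levels = 2 ** bits - 1
    scale = np.max(np.abs(w))
    if scale == 0:
        return w.copy()
    # Map [-scale, scale] to [0, levels], round, then map back.
    q = np.round((w / scale + 1.0) / 2.0 * levels)
    return (q / levels * 2.0 - 1.0) * scale


class WTAQuantizedLayer:
    """Hypothetical dense layer with K quantized kernels and a hard gate."""

    def __init__(self, in_dim, out_dim, num_kernels=4, bits=2, seed=0):
        rng = np.random.default_rng(seed)
        full = rng.standard_normal((num_kernels, in_dim, out_dim))
        # Each parallel kernel is quantized independently.
        self.kernels = np.stack([quantize_uniform(k, bits) for k in full])
        # Assumed gate: one linear score per kernel, computed from the input.
        self.gate = rng.standard_normal((in_dim, num_kernels))

    def select(self, x):
        # Winner-takes-all: keep only the index of the highest-scoring kernel.
        return np.argmax(x @ self.gate, axis=1)

    def forward(self, x):
        idx = self.select(x)
        chosen = self.kernels[idx]  # shape (batch, in_dim, out_dim)
        # Each sample is multiplied by its own selected kernel only.
        return np.einsum("bi,bio->bo", x, chosen), idx
```

Because only the selected kernel is applied to a given input, the per-sample cost stays close to that of a single-kernel layer plus the gate, which is consistent with the abstract's claim of little added computational overhead.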


Bibliographic Details
Main Authors: Phuoc Pham, Jaeyong Chung
Format: Article
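Language: English
Published: MDPI AG 2021-04-01
Series: Electronics
Subjects: quantized networks, model compression, dynamic neural network, conditional computing, model capacity, model representation
Online Access: https://www.mdpi.com/2079-9292/10/8/886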
author Phuoc Pham
Jaeyong Chung
author_facet Phuoc Pham
Jaeyong Chung
author_sort Phuoc Pham
collection DOAJ
description Network quantization is a crucial step when deploying deep models to edge devices: it is hardware-friendly and offers memory and computational advantages, but it also suffers performance degradation as a result of limited representation capability. We address this issue by introducing conditional computing to low-bit quantized networks. Instead of using a single fixed kernel for each layer, which usually does not generalize well across all input data, our proposed method uses multiple parallel kernels dynamically, in conjunction with a winner-takes-all gating mechanism that selects the best one to propagate information. Overall, our proposed method improves upon prior work without adding much computational overhead, resulting in better classification performance on the CIFAR-10 and CIFAR-100 datasets.
first_indexed 2024-03-10T12:30:45Z
format Article
id doaj.art-3ddca0f15b0a4de7a493c40a706b343f
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-10T12:30:45Z
publishDate 2021-04-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-3ddca0f15b0a4de7a493c40a706b343f; 2023-11-21T14:40:11Z; eng; MDPI AG; Electronics; 2079-9292; 2021-04-01; volume 10, issue 8, article 886; 10.3390/electronics10080886; Improving Model Capacity of Quantized Networks with Conditional Computation; Phuoc Pham (0); Jaeyong Chung (1); (0) System on Chips Laboratory, Department of Electronics Engineering, Incheon National University, Incheon 22012, Korea; (1) System on Chips Laboratory, Department of Electronics Engineering, Incheon National University, Incheon 22012, Korea; Network quantization is a crucial step when deploying deep models to edge devices: it is hardware-friendly and offers memory and computational advantages, but it also suffers performance degradation as a result of limited representation capability. We address this issue by introducing conditional computing to low-bit quantized networks. Instead of using a single fixed kernel for each layer, which usually does not generalize well across all input data, our proposed method uses multiple parallel kernels dynamically, in conjunction with a winner-takes-all gating mechanism that selects the best one to propagate information. Overall, our proposed method improves upon prior work without adding much computational overhead, resulting in better classification performance on the CIFAR-10 and CIFAR-100 datasets.; https://www.mdpi.com/2079-9292/10/8/886; quantized networks, model compression, dynamic neural network, conditional computing, model capacity, model representation
spellingShingle Phuoc Pham
Jaeyong Chung
Improving Model Capacity of Quantized Networks with Conditional Computation
Electronics
quantized networks
model compression
dynamic neural network
conditional computing
model capacity
model representation
title Improving Model Capacity of Quantized Networks with Conditional Computation
title_full Improving Model Capacity of Quantized Networks with Conditional Computation
title_fullStr Improving Model Capacity of Quantized Networks with Conditional Computation
title_full_unstemmed Improving Model Capacity of Quantized Networks with Conditional Computation
title_short Improving Model Capacity of Quantized Networks with Conditional Computation
title_sort improving model capacity of quantized networks with conditional computation
topic quantized networks
model compression
dynamic neural network
conditional computing
model capacity
model representation
url https://www.mdpi.com/2079-9292/10/8/886
work_keys_str_mv AT phuocpham improvingmodelcapacityofquantizednetworkswithconditionalcomputation
AT jaeyongchung improvingmodelcapacityofquantizednetworkswithconditionalcomputation