Integer-Only CNNs with 4 Bit Weights and Bit-Shift Quantization Scales at Full-Precision Accuracy

Quantization of neural networks has been one of the most popular techniques to compress models for embedded (IoT) hardware platforms with highly constrained latency, storage, memory-bandwidth, and energy specifications. Limiting the number of bits per weight and activation has been the main focus in the literature. To avoid major degradation of accuracy, common quantization methods introduce additional scale factors to adapt the quantized values to the diverse data ranges present in full-precision (floating-point) neural networks. These scales are usually kept in high precision, requiring the target compute engine to support a few high-precision multiplications, which is not desirable due to the larger hardware cost. Little effort has yet been invested in avoiding high-precision multipliers altogether, especially in combination with 4 bit weights. This work proposes a new quantization scheme, based on power-of-two quantization scales, that performs on par with uniform per-channel quantization using full-precision 32 bit quantization scales while storing only 4 bit weights. This is achieved by adding a low-precision lookup table that translates the stored 4 bit weights into nonuniformly distributed 8 bit weights for internal computation. All our quantized ImageNet CNNs achieved or even exceeded the Top-1 accuracy of their full-precision counterparts, with ResNet18 exceeding its full-precision model by 0.35%. Our MobileNetV2 model achieved state-of-the-art performance with only a slight drop in accuracy of 0.51%.
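
The abstract combines two mechanisms that an integer-only compute path can exploit: per-channel quantization scales restricted to powers of two, so that requantizing an integer accumulator needs only a bit shift instead of a high-precision multiply, and a small lookup table that expands the stored 4 bit weight codes into nonuniformly spaced 8 bit weights before the integer arithmetic. The sketch below is a minimal illustration of these two ideas, not the authors' implementation; the LUT contents, shift amount, and function names are illustrative assumptions.

```python
# Conceptual sketch (illustrative, not the paper's implementation) of
# (1) a power-of-two quantization scale, so rescaling becomes a bit shift, and
# (2) a 16-entry LUT decoding 4-bit weight codes into nonuniform 8-bit weights.
import numpy as np

# Hypothetical LUT: 4-bit code (0..15) -> nonuniform signed 8-bit weight value.
WEIGHT_LUT = np.array(
    [-96, -64, -40, -24, -14, -8, -4, -1, 0, 1, 4, 8, 14, 24, 40, 64],
    dtype=np.int8,
)

def dot_channel_int_only(x_int8, w_codes_4bit, shift):
    """Integer-only dot product for one output channel.

    x_int8       : int8 activations (already quantized upstream)
    w_codes_4bit : stored 4-bit weight codes (values 0..15)
    shift        : per-channel power-of-two scale exponent, i.e. scale = 2**-shift
    """
    w_int8 = WEIGHT_LUT[w_codes_4bit]                  # LUT decode: 4-bit -> 8-bit
    acc = np.dot(x_int8.astype(np.int32),              # 32-bit accumulation
                 w_int8.astype(np.int32))
    # Requantize with a bit shift instead of a high-precision multiply,
    # which is possible because the quantization scale is a power of two.
    y = acc >> shift
    return np.clip(y, -128, 127).astype(np.int8)

# Toy usage with random data.
rng = np.random.default_rng(0)
x = rng.integers(-100, 100, size=64, dtype=np.int8)
codes = rng.integers(0, 16, size=64)
print(dot_channel_int_only(x, codes, shift=9))
```

Restricting the scale to 2**-shift removes the per-channel 32 bit multiplier from the datapath, while the lookup table costs only 16 small entries of storage yet lets the 4 bit codes cover a wide, nonuniform weight range.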

Bibliographic Details
Main Authors: Maarten Vandersteegen, Kristof Van Beeck, Toon Goedemé
Format: Article
Language: English
Published: MDPI AG, 2021-11-01
Series: Electronics, vol. 10, no. 22, article 2823 (ISSN 2079-9292)
DOI: 10.3390/electronics10222823
Affiliation: KU Leuven, EAVISE, Jan Pieter De Nayerlaan 5, 2860 Sint-Katelijne-Waver, Belgium
Subjects: quantization; neural networks; nonuniform; power-of-two scales; low-cost hardware
Online Access: https://www.mdpi.com/2079-9292/10/22/2823