Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training

Quantized neural networks (QNNs) are widely used to achieve computationally efficient solutions to recognition problems. Overall, eight-bit QNNs have almost the same accuracy as full-precision networks while running several times faster. However, networks with lower quantization levels demonstrate inferior accuracy in comparison to their classical analogs. To address this issue, a number of quantization-aware training (QAT) approaches have been proposed. In this paper, we study QAT approaches for two- to eight-bit linear quantization schemes and propose a new combined QAT approach: neuron-by-neuron quantization with straight-through estimator (STE) gradient forwarding. It is suitable for bit widths from two to eight and eliminates significant accuracy drops during training, which results in higher accuracy of the final QNN. We experimentally evaluate our approach on CIFAR-10 and ImageNet classification and show that it is comparable to other approaches at four to eight bits and outperforms some of them at two and three bits while being easier to implement. For example, the proposed approach to three-bit quantization on CIFAR-10 yields 73.2% accuracy, while the direct and layer-by-layer baselines yield 71.4% and 67.2%, respectively. For two-bit quantization of ResNet18 on ImageNet, our approach reaches 63.69% accuracy versus 61.55% for the direct baseline.
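To make the abstract's key terms concrete, the following is a minimal sketch of linear (uniform) fake quantization trained with straight-through estimator (STE) gradient forwarding. It is not the authors' implementation: the PyTorch module, the symmetric signed range, and the max-based per-tensor scale are assumptions chosen purely for illustration.

import torch

class FakeQuantSTE(torch.autograd.Function):
    """Uniform b-bit quantization in the forward pass; identity (straight-through) gradient in backward."""

    @staticmethod
    def forward(ctx, x, scale, bits):
        qmin = -(2 ** (bits - 1))        # e.g. -4 for 3 bits
        qmax = 2 ** (bits - 1) - 1       # e.g.  3 for 3 bits
        q = torch.clamp(torch.round(x / scale), qmin, qmax)
        return q * scale                 # dequantized ("fake-quantized") values

    @staticmethod
    def backward(ctx, grad_output):
        # STE: forward the gradient through the non-differentiable round/clamp unchanged.
        return grad_output, None, None

def fake_quantize(x, bits=3):
    # Illustrative per-tensor scale; real QAT schemes choose or learn the scale differently.
    scale = x.detach().abs().max() / (2 ** (bits - 1) - 1) + 1e-8
    return FakeQuantSTE.apply(x, scale, bits)

# Example usage: x_q = fake_quantize(torch.randn(8), bits=3)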

Bibliographic Details
Main Authors: Artem Sher, Anton Trusov, Elena Limonova, Dmitry Nikolaev, Vladimir V. Arlazarov
Format: Article
Language: English
Published: MDPI AG, 2023-04-01
Series: Mathematics
Subjects: quantized neural network; low-bit quantization; layer-by-layer; neuron-by-neuron training
Online Access: https://www.mdpi.com/2227-7390/11/9/2112
DOI: 10.3390/math11092112
Journal Reference: Mathematics, Vol. 11, No. 9, Article 2112 (April 2023)
ISSN: 2227-7390
Author Affiliations:
Artem Sher and Anton Trusov: Phystech School of Applied Mathematics and Informatics, Moscow Institute of Physics and Technology, 141701 Moscow, Russia
Elena Limonova, Dmitry Nikolaev, and Vladimir V. Arlazarov: Smart Engines Service LLC, 117312 Moscow, Russia