Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training
Quantized neural networks (QNNs) are widely used to achieve computationally efficient solutions to recognition problems. Overall, eight-bit QNNs have almost the same accuracy as full-precision networks while working several times faster. However, networks with lower quantization bit widths demonstrate inferior accuracy compared with their classical analogs. To address this issue, a number of quantization-aware training (QAT) approaches have been proposed. In this paper, we study QAT approaches for two- to eight-bit linear quantization schemes and propose a new combined QAT approach: neuron-by-neuron quantization with straight-through estimator (STE) gradient forwarding. It is suitable for two- to eight-bit widths and eliminates significant accuracy drops during training, which results in better accuracy of the final QNN. We experimentally evaluate our approach on CIFAR-10 and ImageNet classification and show that it is comparable to other approaches for four to eight bits and outperforms some of them for two to three bits, while being easier to implement. For example, the proposed approach to three-bit quantization on the CIFAR-10 dataset results in 73.2% accuracy, while the direct and layer-by-layer baselines result in 71.4% and 67.2% accuracy, respectively. The results for two-bit quantization of ResNet18 on the ImageNet dataset are 63.69% for our approach and 61.55% for the direct baseline.
Main Authors: | Artem Sher, Anton Trusov, Elena Limonova, Dmitry Nikolaev, Vladimir V. Arlazarov |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-04-01 |
Series: | Mathematics |
Subjects: | quantized neural network; low-bit quantization; layer-by-layer; neuron-by-neuron training |
Online Access: | https://www.mdpi.com/2227-7390/11/9/2112 |
ISSN: | 2227-7390 |
DOI: | 10.3390/math11092112 |
Affiliations: | Phystech School of Applied Mathematics and Informatics, Moscow Institute of Physics and Technology, 141701 Moscow, Russia (A. Sher, A. Trusov); Smart Engines Service LLC, 117312 Moscow, Russia (E. Limonova, D. Nikolaev, V. V. Arlazarov) |
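The approach described in the abstract builds on linear quantization trained with straight-through estimator (STE) gradient forwarding. The following is a minimal PyTorch sketch of such an STE-based fake-quantization operator, assuming a simple min/max linear scheme; the names `LinearFakeQuant` and `fake_quant` are illustrative, and the paper's actual scale and zero-point choices may differ.

```python
# Minimal sketch (assumption): linear fake quantization with a straight-through
# estimator. Forward rounds values to 2**bits levels; backward passes the
# gradient through unchanged.
import torch


class LinearFakeQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, bits):
        qmax = 2 ** bits - 1                                # highest integer code
        scale = (x.max() - x.min()).clamp(min=1e-8) / qmax  # step between levels
        zero = x.min()
        q = torch.round((x - zero) / scale)                 # integer code in [0, qmax]
        return q * scale + zero                             # dequantized ("fake") value

    @staticmethod
    def backward(ctx, grad_output):
        # STE: treat quantization as identity when propagating gradients.
        return grad_output, None


def fake_quant(x: torch.Tensor, bits: int = 3) -> torch.Tensor:
    """Quantize `x` to a `bits`-bit linear grid in the forward pass only."""
    return LinearFakeQuant.apply(x, bits)
```

The abstract also contrasts direct and layer-by-layer training with the proposed neuron-by-neuron schedule. Below is a conceptual sketch of what such a schedule could look like, assuming that a "neuron" corresponds to one output channel of a convolutional or fully connected layer and that groups of neurons are switched to quantized weights with a short fine-tuning step in between; `quantize_channel` and `finetune` are hypothetical helpers, not the authors' API.

```python
# Conceptual sketch (assumption): a neuron-by-neuron quantization schedule.
# Output channels ("neurons") of each layer are quantized a few at a time,
# with a short fine-tuning pass after every group, instead of quantizing a
# whole layer (layer-by-layer) or the whole network (direct) at once.
import torch.nn as nn


def neuron_by_neuron(model: nn.Module, bits: int, group: int,
                     quantize_channel, finetune) -> nn.Module:
    for layer in model.modules():
        if not isinstance(layer, (nn.Conv2d, nn.Linear)):
            continue
        n_out = layer.weight.shape[0]                 # one "neuron" per output channel
        for start in range(0, n_out, group):
            for ch in range(start, min(start + group, n_out)):
                quantize_channel(layer, ch, bits)     # fix this neuron's weights in bits-bit form
            finetune(model)                           # let the still-unquantized part adapt
    return model
```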