4.6-Bit Quantization for Fast and Accurate Neural Network Inference on CPUs
Quantization is a widespread method for reducing the inference time of neural networks on mobile Central Processing Units (CPUs). Eight-bit quantized networks demonstrate quality comparable to that of full-precision models and fit the hardware architecture well, with one-byte coefficients and thirty...
Main Authors: Anton Trusov, Elena Limonova, Dmitry Nikolaev, Vladimir V. Arlazarov
Format: Article
Language: English
Published: MDPI AG, 2024-02-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/12/5/651
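As background to the abstract's point about one-byte coefficients and wider accumulators, here is a minimal sketch of standard symmetric 8-bit quantization with a 32-bit integer accumulator. This is illustrative only, not the paper's 4.6-bit scheme; the function names and the per-tensor scaling choice are assumptions for the example.

```python
import numpy as np

def quantize_int8(x):
    """Map a float array to int8 using a per-tensor scale (assumed symmetric scheme)."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_dot(qa, sa, qb, sb):
    """Dot product of int8 vectors with 32-bit accumulation, then dequantize."""
    acc = np.dot(qa.astype(np.int32), qb.astype(np.int32))  # int32 accumulator
    return float(acc) * (sa * sb)

a = np.array([0.5, -1.2, 0.3], dtype=np.float32)
b = np.array([1.0, 0.7, -0.4], dtype=np.float32)
qa, sa = quantize_int8(a)
qb, sb = quantize_int8(b)
approx = int8_dot(qa, sa, qb, sb)   # quantized result
exact = float(np.dot(a, b))         # full-precision reference
```

With one-byte coefficients, the integer dot product stays exact inside the 32-bit accumulator; the only error comes from rounding the inputs, which is why 8-bit networks track full-precision quality closely.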
Similar Items
- Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training
  by: Artem Sher, et al.
  Published: (2023-04-01)
- CANET: Quantized Neural Network Inference With 8-bit Carry-Aware Accumulator
  by: Jingxuan Yang, et al.
  Published: (2024-01-01)
- Training Multi-Bit Quantized and Binarized Networks with a Learnable Symmetric Quantizer
  by: Phuoc Pham, et al.
  Published: (2021-01-01)
- Design of a 2-Bit Neural Network Quantizer for Laplacian Source
  by: Zoran Perić, et al.
  Published: (2021-07-01)
- Clipping-Based Post Training 8-Bit Quantization of Convolution Neural Networks for Object Detection
  by: Leisheng Chen, et al.
  Published: (2022-12-01)