How many bits does it take to quantize your neural network?
Quantization converts neural networks into low-bit fixed-point computations that can be carried out by efficient integer-only hardware, and is standard practice for deploying neural networks on real-time embedded devices. However, like their real-valued counterparts, quantized networks are...
Main Authors: Giacobbe, M; Henzinger, TA; Lechner, M
Format: Conference item
Language: English
Published: Springer, 2020
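The abstract above describes quantization as mapping a network's real-valued computation onto low-bit fixed-point integers. As a minimal illustrative sketch (not the method studied in this paper), symmetric uniform quantization of a weight vector to a signed b-bit integer grid can look like this:

```python
import numpy as np

def quantize(w, num_bits):
    """Symmetric uniform quantization of weights to signed num_bits integers.
    Illustrative sketch only; the function names and scheme are assumptions,
    not the paper's method."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for 4-bit signed
    scale = np.max(np.abs(w)) / qmax        # one scale for the whole tensor
    q = np.round(w / scale).astype(np.int32)  # integer codes in [-qmax, qmax]
    return q, scale

def dequantize(q, scale):
    """Recover the approximate real values from integer codes."""
    return q * scale

w = np.array([0.8, -0.3, 0.05, -0.9])
q, s = quantize(w, 4)   # 4-bit signed: integer codes in [-7, 7]
w_hat = dequantize(q, s)
```

The integer codes `q` are what integer-only hardware would store and compute with; the per-tensor `scale` is folded back in only at the boundaries. The gap between `w` and `w_hat` is the quantization error whose effect on network behavior the bit-width question concerns.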
Similar Items
- Design of a 2-Bit Neural Network Quantizer for Laplacian Source
  by: Zoran Perić, et al.
  Published: (2021-07-01)
- Robust 2-bit Quantization of Weights in Neural Network Modeled by Laplacian Distribution
  by: Perić, Z., et al.
  Published: (2021-08-01)
- How many dissenters does it take to disorder a flock?
  by: D. Yllanes, et al.
  Published: (2017-01-01)
- CANET: Quantized Neural Network Inference With 8-bit Carry-Aware Accumulator
  by: Jingxuan Yang, et al.
  Published: (2024-01-01)
- 4.6-Bit Quantization for Fast and Accurate Neural Network Inference on CPUs
  by: Anton Trusov, et al.
  Published: (2024-02-01)