How many bits does it take to quantize your neural network?
Quantization converts neural networks into low-bit fixed-point computations which can be carried out by efficient integer-only hardware, and is standard practice for the deployment of neural networks on real-time embedded devices. However, like their real-numbered counterpart, quantized networks are...
Main Authors: | , , |
---|---|
Format: | Conference item |
Language: | English |
Published: |
Springer
2020
|