Integer-Only CNNs with 4 Bit Weights and Bit-Shift Quantization Scales at Full-Precision Accuracy
Quantization of neural networks is one of the most popular techniques for compressing models to fit embedded (IoT) hardware platforms with highly constrained latency, storage, memory-bandwidth, and energy budgets. Limiting the number of bits per weight and activation has been the main focus in...
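The "bit-shift quantization scales" in the title refer to restricting quantization scale factors to powers of two, so that rescaling on integer hardware reduces to a bit-shift instead of a multiplication. A minimal NumPy sketch of symmetric 4-bit weight quantization with a power-of-two scale, assuming nothing about the paper's actual training procedure (the function name and rounding choices are illustrative):

```python
import numpy as np

def quantize_po2(w, num_bits=4):
    """Symmetric quantization with a power-of-two scale 2**k.

    Because scale = 2**k, dequantization (q * 2**k) is a plain
    bit-shift on integer hardware rather than a multiplication.
    """
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 7 for signed 4-bit
    max_abs = np.max(np.abs(w))
    k = int(np.ceil(np.log2(max_abs / qmax)))   # smallest exponent covering the range
    scale = 2.0 ** k
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, k

# Toy weight tensor; real CNN weights would be quantized per layer or channel.
w = np.array([0.31, -0.52, 0.07, -0.11], dtype=np.float32)
q, k = quantize_po2(w)
deq = q.astype(np.float32) * 2.0 ** k           # dequantize: shift by k bits
```

Here `k = -3`, so dequantization is a right-shift by 3 on fixed-point hardware; the integer codes stay within the signed 4-bit range [-8, 7].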
Main Authors: Maarten Vandersteegen, Kristof Van Beeck, Toon Goedemé
Format: Article
Language: English
Published: MDPI AG, 2021-11-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/10/22/2823
Similar Items
- A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation
  by: Xuefu Sui, et al. Published: (2022-09-01)
- Latitude-Adaptive Integer Bit Allocation for Quantization of Omnidirectional Images
  by: Qian Sima, et al. Published: (2024-02-01)
- Training Multi-Bit Quantized and Binarized Networks with a Learnable Symmetric Quantizer
  by: Phuoc Pham, et al. Published: (2021-01-01)
- Entropy-Constrained Scalar Quantization with a Lossy-Compressed Bit
  by: Melanie F. Pradier, et al. Published: (2016-12-01)
- Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training
  by: Artem Sher, et al. Published: (2023-04-01)