Integer-Only CNNs with 4 Bit Weights and Bit-Shift Quantization Scales at Full-Precision Accuracy

Quantization of neural networks has been one of the most popular techniques to compress models for embedded (IoT) hardware platforms with highly constrained latency, storage, memory-bandwidth, and energy specifications. Limiting the number of bits per weight and activation has been the main focus in...

Bibliographic Details
Main Authors:	Maarten Vandersteegen, Kristof Van Beeck, Toon Goedemé
Format:	Article
Language:	English
Published:	MDPI AG 2021-11-01
Series:	Electronics
Subjects:	quantization; neural networks; nonuniform; power-of-two scales; low-cost hardware
Online Access:	https://www.mdpi.com/2079-9292/10/22/2823

Similar Items