Optimization of Linear Quantization for General and Effective Low Bit-Width Network Compression

Current edge devices for neural networks, such as FPGAs, CPLDs, and ASICs, support low bit-width computing to reduce execution latency and improve energy efficiency, but traditional linear quantization can only maintain the inference accuracy of neural networks at bit-widths above 6 bits. Different fro...
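
For context, the linear quantization referred to above maps floating-point weights to low bit-width integers with a single scale factor. Below is a minimal sketch of generic symmetric linear quantization (the function names and the use of NumPy are illustrative assumptions, not the authors' implementation), showing how reconstruction error grows as the bit-width drops toward and below roughly 6 bits:

```python
# Generic symmetric linear quantization sketch; not the paper's optimized scheme.
import numpy as np

def linear_quantize(w: np.ndarray, bit_width: int = 6):
    """Quantize a float tensor to signed integers of the given bit-width."""
    qmax = 2 ** (bit_width - 1) - 1            # e.g. 31 for 6 bits
    scale = np.max(np.abs(w)) / qmax           # map the largest magnitude to qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the quantized values."""
    return q.astype(np.float32) * scale

# Example: quantization error increases as the bit-width shrinks.
w = np.random.randn(256).astype(np.float32)
for bits in (8, 6, 4, 2):
    q, s = linear_quantize(w, bits)
    err = np.mean(np.abs(w - dequantize(q, s)))
    print(f"{bits}-bit mean abs error: {err:.4f}")
```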

Bibliographic Details
Main Authors: Wenxin Yang, Xiaoli Zhi, Weiqin Tong
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Algorithms
Online Access: https://www.mdpi.com/1999-4893/16/1/31