AE-Qdrop: Towards Accurate and Efficient Low-Bit Post-Training Quantization for A Convolutional Neural Network
Blockwise reconstruction with adaptive rounding helps achieve acceptable 4-bit post-training quantization accuracy. However, adaptive rounding is time intensive, and the optimization space of weight elements is constrained to a binary set, thus limiting the performance of quantized models. The optim...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2024-02-01
|
Series: | Electronics |
Subjects: | |
Online Access: | https://www.mdpi.com/2079-9292/13/3/644 |