Kernel Quantization for Efficient Network Compression
This paper presents a novel network compression framework, **Kernel Quantization** (***KQ***), which aims to efficiently convert any pre-trained full-precision convolutional neural network (CNN) model into a low-precision version wi...
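The record above only summarizes the goal of converting full-precision weights to a low-precision representation; the paper's own KQ algorithm is not reproduced here. As a generic illustration of the underlying idea, the sketch below uniformly quantizes a single convolution kernel's weights to a 4-bit integer grid and dequantizes them back. The function names and the per-kernel min/max scheme are assumptions for demonstration, not the method described in the article.

```python
import numpy as np

def quantize_kernel(kernel, bits=4):
    """Illustrative uniform quantizer: map each weight to the nearest of
    2**bits levels spanning this kernel's own [min, max] range.
    (Not the paper's KQ method; a generic baseline for intuition.)"""
    qmax = 2 ** bits - 1
    lo, hi = float(kernel.min()), float(kernel.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.round((kernel - lo) / scale).astype(np.int32)  # codes in [0, qmax]
    return q, lo, scale

def dequantize_kernel(q, lo, scale):
    """Recover approximate full-precision weights from integer codes."""
    return q * scale + lo

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3)).astype(np.float32)   # one hypothetical 3x3 kernel
q, lo, scale = quantize_kernel(w, bits=4)
w_hat = dequantize_kernel(q, lo, scale)
err = float(np.abs(w - w_hat).max())              # bounded by scale / 2
```

With 4 bits the reconstruction error of each weight is at most half a quantization step (`scale / 2`), which is the usual accuracy/size trade-off such compression schemes negotiate.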
| Main Authors: | Zhongzhi Yu, Yemin Shi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2022-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/9672186/ |
Similar Items
- HLQ: Hardware-Friendly Logarithmic Quantization Aware Training for Power-Efficient Low-Precision CNN Models
  by: Dahun Choi, et al.
  Published: (2024-01-01)
- Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training
  by: Artem Sher, et al.
  Published: (2023-04-01)
- CANET: Quantized Neural Network Inference With 8-bit Carry-Aware Accumulator
  by: Jingxuan Yang, et al.
  Published: (2024-01-01)
- Efficient Weights Quantization of Convolutional Neural Networks Using Kernel Density Estimation based Non-uniform Quantizer
  by: Sanghyun Seo, et al.
  Published: (2019-06-01)
- A Convolutional Neural Network-Based Quantization Method for Block Compressed Sensing of Images
  by: Jiulu Gong, et al.
  Published: (2024-05-01)