Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization
We consider the problem of accurate quantization for language models, where both the weights and activations are quantized to 4 bits per parameter with uniform quantization, the lowest bitwidth format natively supported by existing GPU hardware. In this context, the key challenge is activation quant...
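The abstract names outlier activation channels as the key obstacle to 4-bit uniform quantization. As a minimal illustration of that problem (not the thesis's method), the sketch below applies symmetric per-tensor uniform quantization to a synthetic activation vector and shows how a single outlier stretches the quantization scale and inflates the error on every other value; all names and values here are hypothetical:

```python
import random

def uniform_quantize(xs, bits=4):
    """Symmetric per-tensor uniform quantization to 2**bits integer levels."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for signed 4-bit
    scale = max(abs(x) for x in xs) / qmax     # one scale shared by the whole tensor
    # round to the nearest level, clip to the representable range, dequantize
    return [max(-qmax - 1, min(qmax, round(x / scale))) * scale for x in xs]

random.seed(0)
acts = [random.gauss(0.0, 1.0) for _ in range(512)]      # well-behaved activations
err = sum(abs(q - x) for q, x in zip(uniform_quantize(acts), acts)) / len(acts)

acts_outlier = acts[:-1] + [50.0]                        # one outlier channel value
err_outlier = sum(abs(q - x) for q, x in
                  zip(uniform_quantize(acts_outlier), acts_outlier)) / len(acts_outlier)

# The outlier inflates the shared scale, so rounding error grows for all entries.
print(err, err_outlier)
```

Because the scale is shared across the tensor, one large value coarsens the grid for everything else, which is why outlier-channel mitigation (the subject of this thesis) matters for low-bitwidth activation quantization.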
| Main Author: | Nrusimha, Aniruddha |
| --- | --- |
| Other Authors: | Kim, Yoon |
| Format: | Thesis |
| Published: | Massachusetts Institute of Technology, 2024 |
| Online Access: | https://hdl.handle.net/1721.1/156280 |
Similar Items
- Soft Quantization Using Entropic Regularization
  by: Rajmadan Lakshmanan, et al.
  Published: (2023-10-01)
- O-2A: Outlier-Aware Compression for 8-bit Post-Training Quantization Model
  by: Nguyen-Dong Ho, et al.
  Published: (2023-01-01)
- Outliers in diffusion-weighted MRI: Exploring detection models and mitigation strategies
  by: Viljami Sairanen, et al.
  Published: (2023-12-01)
- Evaluation of Model Quantization Method on Vitis-AI for Mitigating Adversarial Examples
  by: Yuta Fukuda, et al.
  Published: (2023-01-01)
- Avoided level crossings in the quantization of a mixed regular-chaotic system.
  by: Mainiero, T, et al.
  Published: (2007)