HAQ: Hardware-Aware Automated Quantization With Mixed Precision
Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency, which raises a great challenge to find the optimal bitwidth for...
Main authors: | Wang, Kuan; Liu, Zhijian; Lin, Yujun; Lin, Ji; Han, Song |
---|---|
Other authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Article |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers (IEEE), 2021 |
Online access: | https://hdl.handle.net/1721.1/129522 |
Similar items
- Hardware-Centric AutoML for Mixed-Precision Quantization
  by: Wang, Kuan, et al.
  Published: (2021)
- APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
  by: Wang, Tianzhe, et al.
  Published: (2021)
- HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
  by: Wang, Hanrui, et al.
  Published: (2022)
- Wujud Al-Haq
  by: Jamaluddin Kafie
  Published: (1983)
- Izhar al-haq
  by: Rahmat Allah ibn Khalil al-Rahman
  Published: (2001)