APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that optimize the neural network architecture, pruning policy, and quantization policy separately, we optimize them jointly. To deal with the larger design space it brings,...

Bibliographic Details
Main Authors: Wang, Tianzhe; Wang, Kuan; Cai, Han; Lin, Ji; Liu, Zhijian; Han, Song
Other Authors: Massachusetts Institute of Technology. Microsystems Technology Laboratories
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2021
Online Access: https://hdl.handle.net/1721.1/129496