APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we optimize them jointly. To deal with the larger design space it brings,...

Bibliographic Details
Main Authors: Wang, Tianzhe; Wang, Kuan; Cai, Han; Lin, Ji; Liu, Zhijian; Han, Song
Other Authors: Massachusetts Institute of Technology. Microsystems Technology Laboratories
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2021
Online Access: https://hdl.handle.net/1721.1/129496