APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we design to optimize them in a joint manner. To deal with the larger design space it brings,...
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
Institute of Electrical and Electronics Engineers (IEEE)
2021
|
Online Access: | https://hdl.handle.net/1721.1/129496 |