Mirror Descent view for Neural Network quantization
Quantizing large neural networks (NNs) while maintaining their performance is highly desirable for resource-limited devices because it reduces memory and time complexity. Quantization is usually formulated as a constrained optimization problem and optimized via a modified version of gradient descent. In this work, b...
Main Authors: | , , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | Journal of Machine Learning Research, 2021 |