Complexity-Aware Layer-Wise Mixed-Precision Schemes With SQNR-Based Fast Analysis
Recently, deep neural network (DNN) acceleration has become critical for hardware systems ranging from mobile/edge devices to high-performance data centers. In particular, for on-device AI, many studies have investigated reducing hardware numerical precision given the limited hardware resources of mobile...
| Main Authors: | Hana Kim, Hyun Eun, Jung Hwan Choi, Ji-Hoon Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2023-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/10287357/ |
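The article's title refers to selecting a per-layer bit width using the signal-to-quantization-noise ratio (SQNR) as a fast proxy for accuracy loss. As a minimal sketch of that general idea (not the paper's actual method; the layer names, threshold, and bit-width grid below are hypothetical), one can quantize each layer's weights at several precisions and keep the smallest bit width whose SQNR clears a target:

```python
import numpy as np

def quantize_uniform(x, bits):
    """Symmetric uniform quantization of x to the given bit width."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(x / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale  # dequantized values

def sqnr_db(x, x_hat):
    """Signal-to-quantization-noise ratio in dB."""
    noise = x - x_hat
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))

# Hypothetical per-layer weight tensors with different dynamic ranges.
rng = np.random.default_rng(0)
layer_weights = {
    "conv1": rng.normal(0.0, 1.0, 1024),
    "conv2": rng.normal(0.0, 0.1, 1024),
}

# Pick the smallest bit width per layer whose SQNR clears a threshold.
THRESHOLD_DB = 30.0  # illustrative target, not from the article
chosen = {}
for name, w in layer_weights.items():
    for bits in (4, 6, 8, 10, 12):
        if sqnr_db(w, quantize_uniform(w, bits)) >= THRESHOLD_DB:
            chosen[name] = bits
            break
print(chosen)
```

Because SQNR grows roughly 6 dB per extra bit for uniform quantization, this kind of sweep is cheap compared with retraining or full accuracy evaluation, which is the general appeal of SQNR-based analysis.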
Similar Items
- Forward Adaptive Dual-Mode Quantizer Based on the First-Degree Spline Approximation and Embedded G.711 Codec
  by: Z. Peric, et al.
  Published: (2019-12-01)
- O-2A: Outlier-Aware Compression for 8-bit Post-Training Quantization Model
  by: Nguyen-Dong Ho, et al.
  Published: (2023-01-01)
- Novel Oversampling Technique for Improving Signal-to-Quantization Noise Ratio on Accelerometer-Based Smart Jerk Sensors in CNC Applications
  by: Eduardo Cabal-Yepez, et al.
  Published: (2009-05-01)
- A privacy protection approach in edge-computing based on maximized dnn partition strategy with energy saving
  by: Guo Chaopeng, et al.
  Published: (2023-03-01)
- Differentiable Neural Architecture, Mixed Precision and Accelerator Co-Search
  by: Krishna Teja Chitty-Venkata, et al.
  Published: (2023-01-01)