Summary: | Computer vision applications are rapidly gaining popularity in embedded systems, which typically face a difficult tradeoff between vision performance and energy consumption under a real-time throughput constraint. Recently, hardware implementations (FPGA- and ASIC-based) have emerged that significantly improve the energy efficiency of vision computation. These implementations, however, often involve intensive memory traffic, which still accounts for a significant portion of energy consumption at the system level. To address this issue, we are the first to present a lossy compression framework for input images that exploits the tradeoff between vision performance and memory traffic. To meet the various memory access patterns required by the vision system, the framework includes a line-to-block format conversion. The lossy compression algorithm is a gradient-oriented quantization based on differential pulse-code modulation (DPCM). We also present a corresponding hardware design that supports real-time processing of 1080p at 60 fps with up to 12 scales. For histogram of oriented gradients (HOG)-based deformable part models on VOC2007, the proposed framework reduces memory traffic by 49.6%-60.5% at a detection rate degradation of 0.05%-0.34%. For AlexNet on ImageNet, the memory traffic reduction reaches up to 60.8% with less than 0.61% classification rate degradation. Relative to the power consumption saved by the reduced memory traffic, the overhead of the proposed input image compression is less than 5%.
|