Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network
Convolutional neural networks (CNNs) have achieved excellent performance in image processing and other machine learning tasks. However, the tremendous computation and memory consumption of most classical CNN models poses a great challenge to deployment on portable and power-limited devices.
Main Authors: | Wei Zhou, Yue Niu, Guanwen Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | CNN; layer-wise sensitivity; compression; acceleration |
Online Access: | https://ieeexplore.ieee.org/document/8667287/ |
_version_ | 1818603455084756992 |
---|---|
author | Wei Zhou Yue Niu Guanwen Zhang |
author_facet | Wei Zhou Yue Niu Guanwen Zhang |
author_sort | Wei Zhou |
collection | DOAJ |
description | Convolutional neural networks (CNNs) have achieved excellent performance in image processing and other machine learning tasks. However, the tremendous computation and memory consumption of most classical CNN models poses a great challenge to deployment on portable and power-limited devices. In this paper, by analyzing the sensitivity of the network accuracy to the rank of each layer, we propose a sensitivity-oriented layer-wise low-rank approximation algorithm. Under a given compression and acceleration requirement, a convolutional layer with higher sensitivity keeps more kernels than one with lower sensitivity. In addition, we also demonstrate that global optimization obtains better classification performance than layer-wise fine-tuning. The experimental results show that the proposed method achieves a 20% gain in acceleration ratio compared with traditional rank-reducing methods. When deployed on the VGGNet-16 model, the proposed method achieves a 2.7x compression/acceleration ratio on convolutional layers and a 10.9x compression/acceleration ratio on fully connected (FC) layers, with 0.05% top-1 accuracy loss and 0.01% top-5 accuracy loss. |
first_indexed | 2024-12-16T13:23:26Z |
format | Article |
id | doaj.art-5c733c7af5d1479bb6b2deeeace432e3 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-16T13:23:26Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-5c733c7af5d1479bb6b2deeeace432e3 (2022-12-21T22:30:17Z) | eng | IEEE | IEEE Access, ISSN 2169-3536, 2019-01-01, Vol. 7, pp. 38264-38272 | DOI 10.1109/ACCESS.2019.2905138 | IEEE document 8667287 | Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network | Wei Zhou (https://orcid.org/0000-0001-9715-6957), Yue Niu, Guanwen Zhang — School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China | https://ieeexplore.ieee.org/document/8667287/ | Keywords: CNN, layer-wise sensitivity, compression, acceleration |
spellingShingle | Wei Zhou Yue Niu Guanwen Zhang Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network IEEE Access CNN layer-wise sensitivity compression acceleration |
title | Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network |
title_full | Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network |
title_fullStr | Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network |
title_full_unstemmed | Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network |
title_short | Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network |
title_sort | sensitivity oriented layer wise acceleration and compression for convolutional neural network |
topic | CNN layer-wise sensitivity compression acceleration |
url | https://ieeexplore.ieee.org/document/8667287/ |
work_keys_str_mv | AT weizhou sensitivityorientedlayerwiseaccelerationandcompressionforconvolutionalneuralnetwork AT yueniu sensitivityorientedlayerwiseaccelerationandcompressionforconvolutionalneuralnetwork AT guanwenzhang sensitivityorientedlayerwiseaccelerationandcompressionforconvolutionalneuralnetwork |
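The abstract in this record describes two ingredients: truncated low-rank approximation of each layer's kernels, and a rank budget that favors layers whose accuracy sensitivity is higher. A minimal sketch of both ideas, assuming a flattened 2-D view of each convolutional weight tensor and hypothetical sensitivity scores (the function names `low_rank_approx` and `allocate_ranks` are illustrative, not from the paper):

```python
import numpy as np

def low_rank_approx(W, rank):
    """Truncated-SVD low-rank approximation of a 2-D weight matrix."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the top-`rank` singular components.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def allocate_ranks(sensitivities, total_rank_budget):
    """Give higher-sensitivity layers a larger share of the rank budget."""
    weights = np.asarray(sensitivities, dtype=float)
    shares = weights / weights.sum()
    # Each layer keeps at least rank 1.
    return np.maximum(1, np.round(shares * total_rank_budget).astype(int))

rng = np.random.default_rng(0)
# A 3x3 conv layer with 64 input and 64 output channels, flattened to 64 x 576.
W = rng.standard_normal((64, 576))
W_approx = low_rank_approx(W, rank=16)
err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)

# A more sensitive layer (0.5) receives a larger rank share than a less
# sensitive one (0.2) under a shared budget of 60 ranks.
ranks = allocate_ranks([0.5, 0.3, 0.2], 60)
```

This is only a sketch of the general technique; the paper's actual decomposition of convolutional kernels and its sensitivity measurement are specified in the full text at the URL above.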