Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network


Bibliographic Details
Main Authors: Wei Zhou (https://orcid.org/0000-0001-9715-6957), Yue Niu, Guanwen Zhang
Affiliation: School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access, vol. 7, pp. 38264-38272
DOI: 10.1109/ACCESS.2019.2905138
ISSN: 2169-3536
Collection: DOAJ (Directory of Open Access Journals)
Subjects: CNN, layer-wise sensitivity, compression, acceleration
Online Access: https://ieeexplore.ieee.org/document/8667287/
Description: Convolutional neural networks (CNNs) have achieved excellent performance in image processing and other machine learning tasks. However, the tremendous computation and memory consumption of most classical CNN models poses a great challenge to deployment on portable, power-limited devices. In this paper, by analyzing the sensitivity of the network accuracy to the rank of each layer, we propose a sensitivity-oriented layer-wise low-rank approximation algorithm. Given a specific compression and acceleration requirement, a convolutional layer with higher sensitivity keeps more kernels than one with lower sensitivity. In addition, we demonstrate that global optimization obtains better classification performance than layer-wise fine-tuning. The experimental results show that the proposed method achieves a 20% gain in acceleration ratio compared with traditional rank-reduction methods. When deployed on the VGGNet-16 model, the proposed method achieves a 2.7x compression/acceleration ratio on convolutional layers and a 10.9x compression/acceleration ratio on fully connected (FC) layers, with 0.05% top-1 accuracy loss and 0.01% top-5 accuracy loss.
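
The core idea of the abstract — factorize each convolutional layer to low rank, and let more-sensitive layers keep more rank under a global budget — can be illustrated with a short sketch. The following Python code is a minimal illustration, not the authors' implementation: it factorizes each kernel with a truncated SVD and allocates rank in proportion to a sensitivity score, using relative reconstruction error as a stand-in for the accuracy-based sensitivity the paper measures on a real network. All function names, shapes, the probe rank, and the proportional allocation rule are illustrative assumptions.

# Minimal sketch of sensitivity-guided layer-wise low-rank approximation.
# NOT the authors' code: reconstruction error stands in for the paper's
# accuracy-based sensitivity, and the allocation rule is an assumption.
import numpy as np

def factorize_kernel(W, rank):
    """Truncated-SVD factorization of a conv kernel W of shape
    (c_out, c_in, kh, kw), flattened to a 2-D matrix so one layer
    becomes two cheaper ones through a rank-r bottleneck.
    Returns the two factors and the relative reconstruction error."""
    c_out, c_in, kh, kw = W.shape
    M = W.reshape(c_out, c_in * kh * kw)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = U[:, :rank] * s[:rank]      # (c_out, r) combination factor
    B = Vt[:rank, :]                # (r, c_in*kh*kw) reduced kernel bank
    err = np.linalg.norm(M - A @ B) / np.linalg.norm(M)
    return A, B, err

def layer_sensitivity(W, probe_rank):
    """Proxy sensitivity: reconstruction error at a fixed probe rank.
    (The paper instead measures the network-accuracy drop per layer.)"""
    _, _, err = factorize_kernel(W, probe_rank)
    return err

def allocate_ranks(kernels, total_rank_budget, probe_rank=8):
    """Split a global rank budget so that more-sensitive layers
    keep more rank (i.e., more kernels) than less-sensitive ones."""
    sens = np.array([layer_sensitivity(W, probe_rank) for W in kernels])
    shares = sens / sens.sum()
    return [max(1, int(round(s * total_rank_budget))) for s in shares]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three toy conv layers with VGG-like shape progression.
    kernels = [rng.standard_normal((64, 3, 3, 3)),
               rng.standard_normal((128, 64, 3, 3)),
               rng.standard_normal((256, 128, 3, 3))]
    ranks = allocate_ranks(kernels, total_rank_budget=96)
    for i, (W, r) in enumerate(zip(kernels, ranks)):
        _, _, err = factorize_kernel(W, r)
        print(f"layer {i}: rank {r}, relative error {err:.3f}")

Splitting one layer into a rank-r bank followed by a 1x1 combination is what yields the compression/acceleration ratios the abstract reports; the subsequent fine-tuning step (layer-wise or, per the paper's finding, global) is omitted from this sketch.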