Sensitivity-Oriented Layer-Wise Acceleration and Compression for Convolutional Neural Network


Bibliographic Details
Main Authors: Wei Zhou (https://orcid.org/0000-0001-9715-6957), Yue Niu, Guanwen Zhang
Affiliation: School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access, vol. 7, pp. 38264-38272
DOI: 10.1109/ACCESS.2019.2905138
ISSN: 2169-3536
Collection: DOAJ (Directory of Open Access Journals)
Subjects: CNN, layer-wise sensitivity, compression, acceleration
Online Access: https://ieeexplore.ieee.org/document/8667287/
Description: Convolutional neural networks (CNNs) have achieved excellent performance in image processing and other machine learning tasks. However, the tremendous computation and memory consumption of most classical CNN models poses a great challenge to deployment on portable, power-limited devices. In this paper, by analyzing the sensitivity of the network accuracy to the rank of each layer, we propose a sensitivity-oriented layer-wise low-rank approximation algorithm. Given a specific compression and acceleration requirement, a convolutional layer with higher sensitivity keeps more kernels than one with lower sensitivity. In addition, we demonstrate that global optimization obtains better classification performance than layer-wise fine-tuning. The experimental results show that the proposed method achieves a 20% gain in acceleration ratio compared with traditional rank-reduction methods. When deployed on the VGGNet-16 model, the proposed method achieves a 2.7x compression/acceleration ratio on convolutional layers and a 10.9x compression/acceleration ratio on fully connected (FC) layers, with 0.05% top-1 accuracy loss and 0.01% top-5 accuracy loss.
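
The core idea of the abstract — factorize each convolutional layer to low rank, and let more-sensitive layers keep more rank under a global budget — can be illustrated with a short sketch. The following Python code is a minimal illustration, not the authors' implementation: it factorizes each kernel with a truncated SVD and allocates rank in proportion to a sensitivity score, using relative reconstruction error as a stand-in for the accuracy-based sensitivity the paper measures on a real network. All function names, shapes, the probe rank, and the proportional allocation rule are illustrative assumptions.

# Minimal sketch of sensitivity-guided layer-wise low-rank approximation.
# NOT the authors' code: reconstruction error stands in for the paper's
# accuracy-based sensitivity, and the allocation rule is an assumption.
import numpy as np

def factorize_kernel(W, rank):
    """Truncated-SVD factorization of a conv kernel W of shape
    (c_out, c_in, kh, kw), flattened to a 2-D matrix so one layer
    becomes two cheaper ones through a rank-r bottleneck.
    Returns the two factors and the relative reconstruction error."""
    c_out, c_in, kh, kw = W.shape
    M = W.reshape(c_out, c_in * kh * kw)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = U[:, :rank] * s[:rank]      # (c_out, r) combination factor
    B = Vt[:rank, :]                # (r, c_in*kh*kw) reduced kernel bank
    err = np.linalg.norm(M - A @ B) / np.linalg.norm(M)
    return A, B, err

def layer_sensitivity(W, probe_rank):
    """Proxy sensitivity: reconstruction error at a fixed probe rank.
    (The paper instead measures the network-accuracy drop per layer.)"""
    _, _, err = factorize_kernel(W, probe_rank)
    return err

def allocate_ranks(kernels, total_rank_budget, probe_rank=8):
    """Split a global rank budget so that more-sensitive layers
    keep more rank (i.e., more kernels) than less-sensitive ones."""
    sens = np.array([layer_sensitivity(W, probe_rank) for W in kernels])
    shares = sens / sens.sum()
    return [max(1, int(round(s * total_rank_budget))) for s in shares]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three toy conv layers with VGG-like shape progression.
    kernels = [rng.standard_normal((64, 3, 3, 3)),
               rng.standard_normal((128, 64, 3, 3)),
               rng.standard_normal((256, 128, 3, 3))]
    ranks = allocate_ranks(kernels, total_rank_budget=96)
    for i, (W, r) in enumerate(zip(kernels, ranks)):
        _, _, err = factorize_kernel(W, r)
        print(f"layer {i}: rank {r}, relative error {err:.3f}")

Splitting one layer into a rank-r bank followed by a 1x1 combination is what yields the compression/acceleration ratios the abstract reports; the subsequent fine-tuning step (layer-wise or, per the paper's finding, global) is omitted from this sketch.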