Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo
Structural network pruning is an effective way to reduce network size for deploying deep networks to resource-constrained devices. Existing methods mainly employ knowledge distillation from the last layer of the network to guide pruning of the whole network, while informative features from intermediate layers are not yet fully exploited to improve pruning efficiency and accuracy. In this paper, we propose a block-wisely supervised network pruning (BNP) approach to find the optimal subnet from a baseline network based on knowledge distillation and Markov Chain Monte Carlo. To achieve this, the baseline network is divided into small blocks, and block shrinkage can be applied independently to each block in the same manner. Specifically, block-wise representations of the baseline network are exploited to supervise the subnet search by encouraging each block of the student network to imitate the behavior of the corresponding baseline block. A score metric measuring block accuracy and efficiency is assigned to each block, and the block search is conducted under a Markov Chain Monte Carlo scheme to sample blocks from the posterior. Knowledge distillation enables effective feature representations of the student network, and Markov Chain Monte Carlo provides a sampling scheme to find the optimal solution. Extensive evaluations on multiple network architectures and datasets show that BNP outperforms the state of the art. For instance, with a 0.16% accuracy improvement on the CIFAR-10 dataset, it yields a more compact subnet of ResNet-110 than other methods by reducing FLOPs by 61.24%.
Main Authors: | Huidong Liu, Fang Du, Lijuan Song, Zhenhua Yu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-10-01 |
Series: | Applied Sciences |
Subjects: | network pruning; knowledge distillation; Markov Chain Monte Carlo |
Online Access: | https://www.mdpi.com/2076-3417/12/21/10952 |
_version_ | 1797469245070639104 |
---|---|
author | Huidong Liu Fang Du Lijuan Song Zhenhua Yu |
author_facet | Huidong Liu Fang Du Lijuan Song Zhenhua Yu |
author_sort | Huidong Liu |
collection | DOAJ |
description | Structural network pruning is an effective way to reduce network size for deploying deep networks to resource-constrained devices. Existing methods mainly employ knowledge distillation from the last layer of the network to guide pruning of the whole network, while informative features from intermediate layers are not yet fully exploited to improve pruning efficiency and accuracy. In this paper, we propose a block-wisely supervised network pruning (BNP) approach to find the optimal subnet from a baseline network based on knowledge distillation and Markov Chain Monte Carlo. To achieve this, the baseline network is divided into small blocks, and block shrinkage can be applied independently to each block in the same manner. Specifically, block-wise representations of the baseline network are exploited to supervise the subnet search by encouraging each block of the student network to imitate the behavior of the corresponding baseline block. A score metric measuring block accuracy and efficiency is assigned to each block, and the block search is conducted under a Markov Chain Monte Carlo scheme to sample blocks from the posterior. Knowledge distillation enables effective feature representations of the student network, and Markov Chain Monte Carlo provides a sampling scheme to find the optimal solution. Extensive evaluations on multiple network architectures and datasets show that BNP outperforms the state of the art. For instance, with a 0.16% accuracy improvement on the CIFAR-10 dataset, it yields a more compact subnet of ResNet-110 than other methods by reducing FLOPs by 61.24%. |
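The search scheme the abstract describes (a per-block score trading accuracy against efficiency, explored with Markov Chain Monte Carlo sampling) can be illustrated with a toy sketch. This is not the authors' implementation: the width candidates, the mock score function, and the temperature are all illustrative stand-ins, and the distillation "accuracy" term is simulated rather than measured on a real student network.

```python
import math
import random

# Candidate channel-keep ratios a block may shrink to (hypothetical choices).
WIDTHS = [0.25, 0.5, 0.75, 1.0]

def block_score(width, flops_weight=0.5):
    """Stand-in for the per-block score: imitation quality minus FLOP cost.

    In BNP the accuracy term would come from how well the pruned block
    imitates the teacher block's features; here it is mocked so the
    sketch is self-contained.
    """
    accuracy = 1.0 - (1.0 - width) ** 2   # wider block imitates teacher better
    flops = width ** 2                    # FLOPs grow roughly quadratically in width
    return accuracy - flops_weight * flops

def net_score(config):
    # Blocks are scored independently, so the subnet score is a sum.
    return sum(block_score(w) for w in config)

def mcmc_search(num_blocks=4, steps=2000, temperature=0.05, seed=0):
    """Metropolis-Hastings over per-block width configurations."""
    rng = random.Random(seed)
    config = [rng.choice(WIDTHS) for _ in range(num_blocks)]
    best = list(config)
    for _ in range(steps):
        proposal = list(config)
        proposal[rng.randrange(num_blocks)] = rng.choice(WIDTHS)  # perturb one block
        delta = net_score(proposal) - net_score(config)
        # Accept improvements always; accept worse moves with Boltzmann probability,
        # which lets the chain escape local optima while favoring high-score subnets.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            config = proposal
        if net_score(config) > net_score(best):
            best = list(config)
    return best

if __name__ == "__main__":
    best = mcmc_search()
    print(best, round(net_score(best), 4))
```

Because each block is scored independently, the chain can improve one block at a time, which mirrors the paper's point that block-wise supervision decomposes the global subnet search into small, tractable per-block searches.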
first_indexed | 2024-03-09T19:18:38Z |
format | Article |
id | doaj.art-facfada808c54975b250242aa8f1c5f5 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T19:18:38Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-facfada808c54975b250242aa8f1c5f52023-11-24T03:35:34ZengMDPI AGApplied Sciences2076-34172022-10-0112211095210.3390/app122110952Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte CarloHuidong Liu0Fang Du1Lijuan Song2Zhenhua Yu3School of Information Engineering, Ningxia University, Yinchuan 750021, ChinaSchool of Information Engineering, Ningxia University, Yinchuan 750021, ChinaSchool of Information Engineering, Ningxia University, Yinchuan 750021, ChinaSchool of Information Engineering, Ningxia University, Yinchuan 750021, ChinaStructural network pruning is an effective way to reduce network size for deploying deep networks to resource-constrained devices. Existing methods mainly employ knowledge distillation from the last layer of the network to guide pruning of the whole network, while informative features from intermediate layers are not yet fully exploited to improve pruning efficiency and accuracy. In this paper, we propose a block-wisely supervised network pruning (BNP) approach to find the optimal subnet from a baseline network based on knowledge distillation and Markov Chain Monte Carlo. To achieve this, the baseline network is divided into small blocks, and block shrinkage can be applied independently to each block in the same manner. Specifically, block-wise representations of the baseline network are exploited to supervise the subnet search by encouraging each block of the student network to imitate the behavior of the corresponding baseline block. A score metric measuring block accuracy and efficiency is assigned to each block, and the block search is conducted under a Markov Chain Monte Carlo scheme to sample blocks from the posterior. Knowledge distillation enables effective feature representations of the student network, and Markov Chain Monte Carlo provides a sampling scheme to find the optimal solution. Extensive evaluations on multiple network architectures and datasets show that BNP outperforms the state of the art. For instance, with a 0.16% accuracy improvement on the CIFAR-10 dataset, it yields a more compact subnet of ResNet-110 than other methods by reducing FLOPs by 61.24%.https://www.mdpi.com/2076-3417/12/21/10952network pruningknowledge distillationMarkov Chain Monte Carlo |
spellingShingle | Huidong Liu Fang Du Lijuan Song Zhenhua Yu Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo Applied Sciences network pruning knowledge distillation Markov Chain Monte Carlo |
title | Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo |
title_full | Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo |
title_fullStr | Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo |
title_full_unstemmed | Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo |
title_short | Block-Wisely Supervised Network Pruning with Knowledge Distillation and Markov Chain Monte Carlo |
title_sort | block wisely supervised network pruning with knowledge distillation and markov chain monte carlo |
topic | network pruning knowledge distillation Markov Chain Monte Carlo |
url | https://www.mdpi.com/2076-3417/12/21/10952 |
work_keys_str_mv | AT huidongliu blockwiselysupervisednetworkpruningwithknowledgedistillationandmarkovchainmontecarlo AT fangdu blockwiselysupervisednetworkpruningwithknowledgedistillationandmarkovchainmontecarlo AT lijuansong blockwiselysupervisednetworkpruningwithknowledgedistillationandmarkovchainmontecarlo AT zhenhuayu blockwiselysupervisednetworkpruningwithknowledgedistillationandmarkovchainmontecarlo |