Layer-Wise Network Compression Using Gaussian Mixture Model


Bibliographic Details
Main Authors:	Eunho Lee, Youngbae Hwang
Format:	Article
Language:	English
Published:	MDPI AG 2021-01-01
Series:	Electronics
Subjects:	network pruning; network compression; Gaussian mixture model
Online Access:	https://www.mdpi.com/2079-9292/10/1/72

Description
Summary:	Because of their large number of parameters and heavy computation, deep learning models are still difficult to run in real time on low-performance embedded boards. Network pruning is an effective way to reduce the number of parameters without additional modification of the network structure. However, the conventional method prunes redundant parameters at the same rate for all layers. This can cause a bottleneck that degrades performance, because the minimum number of parameters a layer needs differs from layer to layer. We propose a layer-adaptive pruning method based on modeling the weight distribution. By applying a Gaussian mixture model (GMM), we can accurately measure the amount of weights close to zero. Layer selection and pruning are performed iteratively until the target compression rate is reached. The layer selection in each iteration considers the timing to reach the target compression rate and the degree of weight pruning. We apply the proposed network compression method to image classification and semantic segmentation to show its effectiveness. In the experiments, the proposed method achieves a higher compression rate while maintaining accuracy compared with previous methods.
ISSN:	2079-9292
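
As a rough illustration of the pipeline the summary describes (fit a GMM to each layer's weights, estimate how many weights lie near zero, then select and prune a layer repeatedly until the target compression rate is reached), here is a minimal Python sketch. It is not the authors' implementation. The two-component mixture, the band half-width `eps`, the per-iteration `step`, and the function names `fit_gmm`, `near_zero_mass`, and `prune_layerwise` are all assumptions made for illustration, and the layer selection here uses only the estimated near-zero mass, not the timing criterion mentioned in the summary.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gaussian_cdf(x, mean, var):
    """Cumulative distribution function of a 1-D Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def fit_gmm(weights, iters=50):
    """Fit a two-component 1-D GMM with plain EM; returns [(pi, mean, var), ...].

    Assumption: two components are enough to separate a narrow near-zero
    cluster from the broader weight distribution.
    """
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n + 1e-12
    # Start with one narrow and one wide component around the sample mean.
    comps = [(0.5, mean, 0.1 * var), (0.5, mean, 2.0 * var)]
    for _ in range(iters):
        # E-step: responsibility of component 0 for every weight.
        resp = []
        for w in weights:
            p0 = comps[0][0] * gaussian_pdf(w, comps[0][1], comps[0][2])
            p1 = comps[1][0] * gaussian_pdf(w, comps[1][1], comps[1][2])
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0.0 else 0.5)
        # M-step: re-estimate mixing coefficient, mean and variance.
        new_comps = []
        for k in range(2):
            r = resp if k == 0 else [1.0 - x for x in resp]
            nk = sum(r) + 1e-12
            mu = sum(ri * w for ri, w in zip(r, weights)) / nk
            vk = sum(ri * (w - mu) ** 2 for ri, w in zip(r, weights)) / nk + 1e-12
            new_comps.append((nk / n, mu, vk))
        comps = new_comps
    return comps


def near_zero_mass(weights, eps=0.05, iters=50):
    """Probability mass the fitted GMM places on |w| < eps (eps is hypothetical)."""
    return sum(
        pi * (gaussian_cdf(eps, mu, var) - gaussian_cdf(-eps, mu, var))
        for pi, mu, var in fit_gmm(weights, iters)
    )


def prune_layerwise(layers, target_rate, step=0.1):
    """Prune layer by layer until `target_rate` of all weights is removed.

    `layers` maps a layer name to a non-empty list of weights; the result
    maps each name to the weights that survive pruning.
    """
    # Sort by magnitude so the smallest weights sit at the front of each list.
    remaining = {name: sorted(ws, key=abs) for name, ws in layers.items()}
    total = sum(len(ws) for ws in remaining.values())

    def rate():
        return 1.0 - sum(len(ws) for ws in remaining.values()) / total

    while rate() < target_rate:
        candidates = [name for name, ws in remaining.items() if len(ws) > 1]
        if not candidates:
            break  # every layer is down to a single weight
        # Simplified selection: the layer with the largest near-zero mass.
        chosen = max(candidates, key=lambda name: near_zero_mass(remaining[name]))
        ws = remaining[chosen]
        n_drop = max(1, int(step * len(ws)))
        remaining[chosen] = ws[n_drop:]  # drop the smallest-magnitude weights
    return remaining
```

Calling `prune_layerwise({"conv1": w1, "conv2": w2}, target_rate=0.3)` with plain lists of floats (`w1` and `w2` are placeholder names) returns the surviving weights per layer. Under these assumptions, a layer whose weights are concentrated near zero should be pruned more heavily than one with a broad weight distribution, which is the layer-adaptive behavior the summary contrasts with a uniform pruning rate.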

Similar Items