Compression of Deep-Learning Models Through Global Weight Pruning Using Alternating Direction Method of Multipliers

Abstract: Deep learning has shown excellent performance in numerous machine-learning tasks, but a practical obstacle is the huge amount of computation and memory it requires. Model compression is therefore especially useful in deep learning, because it saves memory and reduces storage size while maintaining model performance. In a layered network, compression aims to reduce the number of edges by pruning weights deemed unnecessary to the computation. Existing weight-pruning methods, however, reduce the network layer by layer and require a predefined removal-ratio constraint for each layer. These per-layer removal ratios must be specified structurally, depending on the task, so the large number of tuning parameters sharply increases training time; such a layer-by-layer strategy is hardly feasible for deep models. Our proposed method performs weight pruning in a deep layered network, at similar performance, by setting a single global removal ratio for the entire model, without prior knowledge of its structural characteristics. Experiments with the proposed method show reliable, high-quality performance while obviating per-layer removal ratios. Furthermore, experiments with increasing numbers of layers reveal a pattern in the pruned weights that offers insight into the layers' structural importance. On the LeNet-5 model with MNIST data, the proposed method reaches a compression ratio of 98.8%, outperforming existing pruning algorithms. On ResNet-56, we investigate how performance changes as the removal ratio varies from 10% to 90%, achieving a higher removal ratio than the other tested models. We also demonstrate the method's effectiveness on YOLOv4, a real-life object-detection model requiring substantial computation.
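The record does not include the authors' implementation, but the mechanism the abstract describes — enforcing one global removal ratio across all layers via the alternating direction method of multipliers (ADMM) rather than tuning a ratio per layer — can be sketched concretely. In ADMM-based pruning, the sparsity constraint is handled by a Euclidean projection that keeps the top-k weights by magnitude counted over the whole model, so each layer's effective pruning rate emerges from the optimization instead of being preset. The NumPy sketch below is a minimal, hypothetical illustration under those assumptions; the function names, the penalty parameter rho, the learning rate, and the grad_fn interface are inventions for exposition, not the paper's code.

```python
import numpy as np

def global_prune_projection(weights, removal_ratio):
    """Keep the top-k weights by magnitude, with k counted globally
    across every layer (one removal ratio for the whole model)."""
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(round(flat.size * (1.0 - removal_ratio)))  # weights to keep
    if k <= 0:
        return [np.zeros_like(w) for w in weights]
    # magnitude of the k-th largest weight over the whole model
    threshold = np.partition(np.abs(flat), -k)[-k]
    return [np.where(np.abs(w) >= threshold, w, 0.0) for w in weights]

def admm_global_prune(weights, grad_fn, removal_ratio,
                      rho=1e-3, lr=1e-2, steps=100):
    """Hypothetical ADMM loop: W takes gradient steps on the task loss
    plus a quadratic penalty pulling it toward Z; Z is the global
    pruning projection of W + U; U is the scaled dual variable."""
    W = [w.astype(float).copy() for w in weights]
    Z = global_prune_projection(W, removal_ratio)
    U = [np.zeros_like(w) for w in W]
    for _ in range(steps):
        # W-update: descend loss(W) + (rho/2) * ||W - Z + U||^2
        grads = grad_fn(W)
        W = [w - lr * (g + rho * (w - z + u))
             for w, g, z, u in zip(W, grads, Z, U)]
        # Z-update: projection enforcing the single global removal ratio
        Z = global_prune_projection([w + u for w, u in zip(W, U)],
                                    removal_ratio)
        # U-update: dual ascent on the constraint W = Z
        U = [u + w - z for u, w, z in zip(U, W, Z)]
    # final hard prune so the returned weights satisfy the constraint
    return global_prune_projection(W, removal_ratio)

# Toy usage: two random "layers" and a dummy gradient (of (1/2)||W||^2)
layers = [np.random.randn(8, 8), np.random.randn(4, 8)]
pruned = admm_global_prune(layers, grad_fn=lambda W: [w for w in W],
                           removal_ratio=0.9)
```

For example, with removal_ratio=0.9 the projection keeps only the largest 10% of weights model-wide, so layers whose weights are mostly small are pruned far more aggressively than layers with uniformly large weights — consistent with the abstract's observation that the pruning pattern reflects the layers' structural importance.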

Bibliographic Details
Main Authors: Kichun Lee, Sunghun Hwangbo, Dongwook Yang, Geonseok Lee
Format: Article
Language: English
Published: Springer, 2023-02-01
Series: International Journal of Computational Intelligence Systems, Vol. 16, No. 1, pp. 1–13
ISSN: 1875-6883
Affiliation: Department of Industrial Engineering, Hanyang University (all four authors)
Subjects: Network compression; Weight pruning; Non-convex optimization; Parallel computing
Online Access: https://doi.org/10.1007/s44196-023-00202-z