Convolutional Neural Network Model Compression Method for Software—Hardware Co-Design
Owing to their high accuracy, deep convolutional neural networks (CNNs) are extensively used, but they are computationally complex, and current CNN systems require real-time performance and acceleration. A graphics processing unit (GPU) is one way to improve real-time performance; however, its performance per watt is poor because of its high power consumption. By contrast, field-programmable gate arrays (FPGAs) have lower power consumption and a flexible architecture, making them more suitable for CNN implementation. In this study, we propose a method that offers both the speed of CNNs and the low power and parallelism of FPGAs. The solution relies on two primary acceleration techniques: parallel processing of layer resources and pipelining within specific layers. Moreover, a new method is introduced for trading off speed against design time by implementing an automatic parallel hardware–software co-design CNN using a software-defined system-on-chip tool. We evaluated the proposed method on five networks (MobileNetV1, ShuffleNetV2, SqueezeNet, ResNet-50, and VGG-16) and the ZCU102 FPGA platform, and demonstrated experimentally that our design achieves a higher speed-up than the conventional implementation method: 2.47×, 1.93×, and 2.16× on the ZCU102 for MobileNetV1, ShuffleNetV2, and SqueezeNet, respectively.
Main Authors: Seojin Jang, Wei Liu, Yongbeom Cho
Format: Article
Language: English
Published: MDPI AG, 2022-09-01
Series: Information
Subjects: convolutional neural network; field-programmable gate array; hardware–software co-design
Online Access: https://www.mdpi.com/2078-2489/13/10/451
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-ef471c450efc478480e31b56f0a63345
ISSN: 2078-2489
DOI: 10.3390/info13100451
Affiliations: Seojin Jang and Yongbeom Cho, Department of Electrical and Electronics Engineering, Konkuk University, Seoul 05029, Korea; Wei Liu, Deep ET, Seoul 05029, Korea
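The abstract names two acceleration techniques: parallel processing of layer resources and pipelining within specific layers. As a purely illustrative aid, here is a minimal HLS-style C++ sketch of how those two ideas are commonly expressed for a software-defined SoC flow such as Xilinx SDSoC; the function name conv3x3, the sizes IN_CH/OUT_CH/DIM/K, and the pragma placements are invented for this sketch and are not taken from the paper's implementation.

```cpp
// Hypothetical sketch only: intra-layer pipelining (PIPELINE) and parallel
// processing of a layer's resources (UNROLL + ARRAY_PARTITION) for one small
// 3x3 convolution layer. All names and sizes are invented for illustration.
#include <cstdint>

constexpr int IN_CH  = 16;          // input channels (illustrative)
constexpr int OUT_CH = 16;          // output channels (illustrative)
constexpr int DIM    = 32;          // input feature-map height/width
constexpr int K      = 3;           // kernel size
constexpr int OUT_D  = DIM - K + 1; // output height/width (no padding)

void conv3x3(const int8_t in[IN_CH][DIM][DIM],
             const int8_t w[OUT_CH][IN_CH][K][K],
             int32_t out[OUT_CH][OUT_D][OUT_D]) {
// Partition the channel dimension so all input channels can be read in parallel.
#pragma HLS ARRAY_PARTITION variable=in complete dim=1
#pragma HLS ARRAY_PARTITION variable=w complete dim=2
  for (int oc = 0; oc < OUT_CH; ++oc) {
    for (int y = 0; y < OUT_D; ++y) {
      for (int x = 0; x < OUT_D; ++x) {
// Pipelining within the layer: start a new output pixel every cycle (II=1).
#pragma HLS PIPELINE II=1
        int32_t acc = 0;
        for (int ic = 0; ic < IN_CH; ++ic) {
// Parallel processing of layer resources: one MAC tree per input channel.
#pragma HLS UNROLL
          for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx)
              acc += static_cast<int32_t>(in[ic][y + ky][x + kx]) * w[oc][ic][ky][kx];
        }
        out[oc][y][x] = acc;
      }
    }
  }
}
```

In this sketch, PIPELINE asks the HLS scheduler to begin a new output pixel each cycle, while UNROLL and ARRAY_PARTITION replicate the multiply-accumulate datapath and its memory ports across input channels; repeating the same pattern per layer, and running independent layers concurrently, is one common way to realize the intra-layer pipelining and layer-level parallelism the abstract describes.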