Convolutional Neural Network Model Compression Method for Software–Hardware Co-Design

Owing to their high accuracy, deep convolutional neural networks (CNNs) are used extensively, but their high computational complexity means that current CNN systems require acceleration to achieve real-time performance. A graphics processing unit (GPU) can provide real-time performance, but its high power consumption makes its performance-per-watt ratio poor. By contrast, field-programmable gate arrays (FPGAs) consume less power and offer a flexible architecture, making them better suited to CNN implementation. In this study, we propose a method that retains the speed of CNNs while exploiting the low power consumption and parallelism of FPGAs. The solution relies on two primary acceleration techniques: parallel processing of layer resources and pipelining within specific layers. Moreover, we introduce a new way to trade speed against design time by implementing an automatic, parallel hardware–software co-designed CNN with a software-defined system-on-chip (SDSoC) tool. We evaluated the proposed method on five networks (MobileNetV1, ShuffleNetV2, SqueezeNet, ResNet-50, and VGG-16) running on a Xilinx ZCU102 FPGA board. Our design achieves a higher speed-up than the conventional implementation method: 2.47×, 1.93×, and 2.16× on the ZCU102 for MobileNetV1, ShuffleNetV2, and SqueezeNet, respectively.

Bibliographic Details
Main Authors: Seojin Jang, Wei Liu, Yongbeom Cho
Author Affiliations: Seojin Jang and Yongbeom Cho: Department of Electrical and Electronics Engineering, Konkuk University, Seoul 05029, Korea; Wei Liu: Deep ET, Seoul 05029, Korea
Format: Article
Language: English
Published: MDPI AG, 2022-09-01
Series: Information, Vol. 13, Issue 10, Article 451
ISSN: 2078-2489
DOI: 10.3390/info13100451
Subjects: convolutional neural network; field-programmable gate array; hardware–software co-design
Online Access: https://www.mdpi.com/2078-2489/13/10/451