Convolutional Neural Network Model Compression Method for Software—Hardware Co-Design
Owing to their high accuracy, deep convolutional neural networks (CNNs) are extensively used, but they are computationally complex, and current CNN systems require real-time performance and acceleration. A graphics processing unit (GPU) is one way to improve real-time performance; however, its performance per watt is poor because of its high power consumption. By contrast, field-programmable gate arrays (FPGAs) have lower power consumption and a flexible architecture, making them more suitable for CNN implementation. In this study, we propose a method that offers both the speed of CNNs and the low power and parallelism of FPGAs. The solution relies on two primary acceleration techniques: parallel processing of layer resources and pipelining within specific layers. Moreover, a new method is introduced for trading off speed against design time by implementing an automatic parallel hardware–software co-design CNN using a software-defined system-on-chip tool. We evaluated the proposed method on five networks (MobileNetV1, ShuffleNetV2, SqueezeNet, ResNet-50, and VGG-16) and the ZCU102 FPGA platform, and demonstrated experimentally that our design achieves a higher speed-up than the conventional implementation method: 2.47×, 1.93×, and 2.16× on the ZCU102 for MobileNetV1, ShuffleNetV2, and SqueezeNet, respectively.
Main Authors: Seojin Jang, Wei Liu, Yongbeom Cho
Format: Article
Language: English
Published: MDPI AG, 2022-09-01
Series: Information
Subjects: convolutional neural network; field-programmable gate array; hardware–software co-design
Online Access: https://www.mdpi.com/2078-2489/13/10/451
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-ef471c450efc478480e31b56f0a63345
ISSN: 2078-2489
DOI: 10.3390/info13100451
Affiliations: Seojin Jang and Yongbeom Cho, Department of Electrical and Electronics Engineering, Konkuk University, Seoul 05029, Korea; Wei Liu, Deep ET, Seoul 05029, Korea
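The abstract names two acceleration techniques: parallel processing of layer resources and pipelining within specific layers. As a purely illustrative aid, here is a minimal HLS-style C++ sketch of how those two ideas are commonly expressed for a software-defined SoC flow such as Xilinx SDSoC; the function name conv3x3, the sizes IN_CH/OUT_CH/DIM/K, and the pragma placements are invented for this sketch and are not taken from the paper's implementation.

```cpp
// Hypothetical sketch only: intra-layer pipelining (PIPELINE) and parallel
// processing of a layer's resources (UNROLL + ARRAY_PARTITION) for one small
// 3x3 convolution layer. All names and sizes are invented for illustration.
#include <cstdint>

constexpr int IN_CH  = 16;          // input channels (illustrative)
constexpr int OUT_CH = 16;          // output channels (illustrative)
constexpr int DIM    = 32;          // input feature-map height/width
constexpr int K      = 3;           // kernel size
constexpr int OUT_D  = DIM - K + 1; // output height/width (no padding)

void conv3x3(const int8_t in[IN_CH][DIM][DIM],
             const int8_t w[OUT_CH][IN_CH][K][K],
             int32_t out[OUT_CH][OUT_D][OUT_D]) {
// Partition the channel dimension so all input channels can be read in parallel.
#pragma HLS ARRAY_PARTITION variable=in complete dim=1
#pragma HLS ARRAY_PARTITION variable=w complete dim=2
  for (int oc = 0; oc < OUT_CH; ++oc) {
    for (int y = 0; y < OUT_D; ++y) {
      for (int x = 0; x < OUT_D; ++x) {
// Pipelining within the layer: start a new output pixel every cycle (II=1).
#pragma HLS PIPELINE II=1
        int32_t acc = 0;
        for (int ic = 0; ic < IN_CH; ++ic) {
// Parallel processing of layer resources: one MAC tree per input channel.
#pragma HLS UNROLL
          for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx)
              acc += static_cast<int32_t>(in[ic][y + ky][x + kx]) * w[oc][ic][ky][kx];
        }
        out[oc][y][x] = acc;
      }
    }
  }
}
```

In this sketch, PIPELINE asks the HLS scheduler to begin a new output pixel each cycle, while UNROLL and ARRAY_PARTITION replicate the multiply-accumulate datapath and its memory ports across input channels; repeating the same pattern per layer, and running independent layers concurrently, is one common way to realize the intra-layer pipelining and layer-level parallelism the abstract describes.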