Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network

This paper proposes two max-pooling engines, named the RTB-MAXP engine and the CMB-MAXP engine, with a scalable window size parameter for FPGA-based convolutional neural network (CNN) implementation. The max-pooling operation for the CNN can be decomposed into two stages, i.e., a horizontal axis max...
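The two-stage decomposition mentioned in the description can be illustrated in software. The following is a minimal sketch, not the authors' VHDL engines: the function names `max_pool_1d`, `max_pool_2d_two_stage`, and `max_pool_2d_direct` are hypothetical, the window is assumed square, and boundary padding is not modeled. It checks that a horizontal pass followed by a vertical pass gives the same result as a direct two-dimensional max pooling.

```python
def max_pool_1d(values, window, stride):
    """Sliding-window maximum along one axis (no padding; assumption of this sketch)."""
    out = []
    for start in range(0, len(values) - window + 1, stride):
        running = values[start]
        for value in values[start + 1:start + window]:
            running = max(running, value)  # chain of pairwise maximum operations
        out.append(running)
    return out


def max_pool_2d_two_stage(image, window, stride):
    """2-D max pooling as a horizontal pass followed by a vertical pass."""
    # Stage 1: pool every row along the horizontal axis.
    horizontal = [max_pool_1d(row, window, stride) for row in image]
    # Stage 2: pool every column of the intermediate result along the vertical axis.
    pooled_columns = [max_pool_1d(list(col), window, stride) for col in zip(*horizontal)]
    # Transpose back so the result is indexed as [row][column].
    return [list(row) for row in zip(*pooled_columns)]


def max_pool_2d_direct(image, window, stride):
    """Reference 2-D max pooling over full window x window blocks."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r + i][c + j] for i in range(window) for j in range(window))
            for c in range(0, cols - window + 1, stride)
        ]
        for r in range(0, rows - window + 1, stride)
    ]


image = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
print(max_pool_2d_two_stage(image, window=2, stride=2))  # [[6, 8], [9, 6]]
```

The decomposition works because the maximum over a rectangular window equals the maximum of the per-row maxima, so each stage only needs a one-dimensional window.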


Bibliographic Details
Main Authors:	Eonpyo Hong, Kang-A Choi, Jhihoon Joo
Format:	Article
Language:	English
Published:	MDPI AG 2023-09-01
Series:	Electronics
Subjects:	max pooling; convolutional neural network (CNN); FPGA; rank-tracking-based max pooling (RTB-MAXP); cascaded maximum based max pooling (CMB-MAXP)
Online Access:	https://www.mdpi.com/2079-9292/12/19/4043

_version_	1797575995945910272
author	Eonpyo Hong; Kang-A Choi; Jhihoon Joo
author_facet	Eonpyo Hong; Kang-A Choi; Jhihoon Joo
author_sort	Eonpyo Hong
collection	DOAJ
description	This paper proposes two max-pooling engines, named the RTB-MAXP engine and the CMB-MAXP engine, with a scalable window size parameter for FPGA-based convolutional neural network (CNN) implementation. The max-pooling operation for the CNN can be decomposed into two stages, i.e., a horizontal axis max-pooling operation and a vertical axis max-pooling operation. These two one-dimensional max-pooling operations are performed by tracking the rank of the values within the window in the RTB-MAXP engine and cascading the maximum operations of the values in the CMB-MAXP engine. Both the RTB-MAXP engine and the CMB-MAXP engine were implemented using VHSIC hardware description language (VHDL) and verified by simulations. The implementation results demonstrate that the 16 CMB-MAXP engines achieved a remarkable throughput of about 9 GBPS (gigabytes per second) while utilizing only about 3% of the available resources on the Xilinx Virtex UltraScale+ FPGA XCVU9P. On the other hand, the 16 RTB-MAXP engines exhibited somewhat lower throughput and resource utilization, although they did offer a slightly better latency when compared to the CMB-MAXP engines. In the comparison with existing techniques, the CMB-MAXP engine exhibited comparable implementation results in terms of the resource utilization and maximum operating frequency. It is crucial to note that only the proposed engines provide the features of runtime window scalability and boundary padding capability, which are essential requirements for CNN accelerators. The proposed max-pooling engines were employed and tested in our CNN accelerator targeting the CNN model YOLOv4-CSP-S-Leaky for object detection.
first_indexed	2024-03-10T21:46:59Z
format	Article
id	doaj.art-be4ea665cc964fcb97c5b0f0c0f67201
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-10T21:46:59Z
publishDate	2023-09-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-be4ea665cc964fcb97c5b0f0c0f67201; 2023-11-19T14:16:25Z; eng; MDPI AG; Electronics; 2079-9292; 2023-09-01; vol. 12, no. 19, art. 4043; 10.3390/electronics12194043; Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network; Eonpyo Hong, Kang-A Choi, Jhihoon Joo (all: Agency for Defense Development, Yuseong P.O. Box 35, Daejeon 34186, Republic of Korea); https://www.mdpi.com/2079-9292/12/19/4043; max pooling; convolutional neural network (CNN); FPGA; rank-tracking-based max pooling (RTB-MAXP); cascaded maximum based max pooling (CMB-MAXP)
spellingShingle	Eonpyo Hong; Kang-A Choi; Jhihoon Joo; Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network; Electronics; max pooling; convolutional neural network (CNN); FPGA; rank-tracking-based max pooling (RTB-MAXP); cascaded maximum based max pooling (CMB-MAXP)
title	Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network
title_full	Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network
title_fullStr	Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network
title_full_unstemmed	Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network
title_short	Efficient Two-Stage Max-Pooling Engines for an FPGA-Based Convolutional Neural Network
title_sort	efficient two stage max pooling engines for an fpga based convolutional neural network
topic	max pooling; convolutional neural network (CNN); FPGA; rank-tracking-based max pooling (RTB-MAXP); cascaded maximum based max pooling (CMB-MAXP)
url	https://www.mdpi.com/2079-9292/12/19/4043
work_keys_str_mv	AT eonpyohong efficienttwostagemaxpoolingenginesforanfpgabasedconvolutionalneuralnetwork AT kangachoi efficienttwostagemaxpoolingenginesforanfpgabasedconvolutionalneuralnetwork AT jhihoonjoo efficienttwostagemaxpoolingenginesforanfpgabasedconvolutionalneuralnetwork
