Enhanced mechanisms of pooling and channel attention for deep learning feature maps
The pooling function is vital for deep neural networks (DNNs): it generalizes the representation of feature maps and progressively reduces their spatial size, cutting the network's computational cost. Pooling is also the basis of channel attention mechanisms in computer vision. However, pooling is a down-sampling operation that replaces neighborhoods of pixels with a summary statistic, making the feature-map representation approximately invariant to small translations; as a result, it inevitably discards some information. In this article, we propose a fused max-average pooling operation (FMAPooling) and an improved channel attention mechanism (FMAttn) that use both pooling functions to enhance feature representation in DNNs. The core idea is to combine the complementary, multi-level features extracted by max pooling and average pooling. The effectiveness of the proposals is verified with VGG, ResNet, and MobileNetV2 architectures on CIFAR10/100 and ImageNet100. According to the experimental results, FMAPooling improves accuracy by up to 1.63% over the baseline model, and FMAttn improves accuracy by up to 2.21% over a previous channel attention mechanism. Furthermore, the proposals are extensible: they can be embedded into various DNN models easily or replace certain existing structures, and the computational overhead they introduce is negligible.
Main Authors: | Hengyi Li, Xuebin Yue, Lin Meng |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc., 2022-11-01 |
Series: | PeerJ Computer Science |
Subjects: | DNNs; Max pooling; Average pooling; FMAPooling; Self-attention; FMAttn |
Online Access: | https://peerj.com/articles/cs-1161.pdf |
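The abstract describes FMAPooling as a fusion of max pooling and average pooling; the linked PDF gives the exact formulation. The PyTorch sketch below is only a rough illustration of the general idea, blending the two pooled outputs with a learnable per-channel weight. The module name `FusedMaxAvgPool2d`, the sigmoid-gated `alpha` parameter, and the choice of PyTorch are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumption): blend max-pooled and average-pooled maps with a
# learnable per-channel weight. This is NOT the authors' FMAPooling definition,
# only an illustration of fusing the two pooling operators.
import torch
import torch.nn as nn


class FusedMaxAvgPool2d(nn.Module):
    """Hypothetical fused max-average pooling (illustrative only)."""

    def __init__(self, channels: int, kernel_size: int = 2, stride: int = 2):
        super().__init__()
        self.max_pool = nn.MaxPool2d(kernel_size, stride)
        self.avg_pool = nn.AvgPool2d(kernel_size, stride)
        # One blending weight per channel, squashed to (0, 1) by a sigmoid.
        self.alpha = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.sigmoid(self.alpha)  # per-channel mixing weight
        return w * self.max_pool(x) + (1.0 - w) * self.avg_pool(x)


if __name__ == "__main__":
    # Same input/output shapes as a plain pooling layer with the same settings.
    x = torch.randn(8, 64, 32, 32)     # (batch, channels, H, W)
    pool = FusedMaxAvgPool2d(channels=64)
    print(pool(x).shape)               # torch.Size([8, 64, 16, 16])
```

Because such a module keeps the output shape of a standard pooling layer, it could be swapped in wherever a network already down-samples, which is consistent with the abstract's claim that the proposals can replace existing structures with negligible extra computation.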
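FMAttn is described in the abstract only as an improved channel attention mechanism that uses both pooling functions. The sketch below follows the familiar pattern of feeding globally max-pooled and average-pooled channel descriptors through a shared bottleneck MLP and merging them into per-channel weights; the `reduction` ratio and the way the two branches are summed are illustrative assumptions, not the paper's definition of FMAttn.

```python
# Minimal sketch (assumption): channel attention driven by both global average
# and global max pooling. Not the authors' FMAttn; the branch combination
# shown here is illustrative.
import torch
import torch.nn as nn


class DualPoolChannelAttention(nn.Module):
    """Hypothetical channel attention using max- and average-pooled descriptors."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)   # (B, C, 1, 1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)   # (B, C, 1, 1)
        # Shared bottleneck MLP implemented with 1x1 convolutions.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-channel descriptors from the two pooling functions, merged by sum.
        attn = self.mlp(self.avg_pool(x)) + self.mlp(self.max_pool(x))
        scale = torch.sigmoid(attn)               # per-channel weights in (0, 1)
        return x * scale                          # re-weight the feature map


if __name__ == "__main__":
    x = torch.randn(8, 64, 32, 32)
    attn = DualPoolChannelAttention(channels=64)
    print(attn(x).shape)                          # torch.Size([8, 64, 32, 32])
```

Like the pooling sketch, this block preserves the feature-map shape, so it can be inserted after a convolutional stage without changing the rest of the architecture.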
collection | DOAJ |
id | doaj.art-102d196fab044322b1ccbadf93e577d5 |
institution | Directory Open Access Journal |
issn | 2376-5992 |
doi | 10.7717/peerj-cs.1161 (PeerJ Computer Science 8:e1161, 2022-11-01)
affiliations | Hengyi Li and Xuebin Yue: Graduate School of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan; Lin Meng: College of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan