HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection

The goals of object detection are to accurately detect and locate objects of various sizes in digital images. Multi-scale processing technology can improve the detection accuracy of the detector. Feature pyramid networks (FPNs) have been proven to be effective in extracting multi-scaled features. Ho...

Bibliographic Details
Main Authors: Jin Dang, Xiaofen Tang, Shuai Li
Format: Article
Language: English
Published: MDPI AG 2023-05-01
Series: Sensors
Subjects: transformer; feature pyramid networks; object detection; attention modules
Online Access: https://www.mdpi.com/1424-8220/23/9/4508
_version_ 1797601656901206016
author Jin Dang
Xiaofen Tang
Shuai Li
author_facet Jin Dang
Xiaofen Tang
Shuai Li
author_sort Jin Dang
collection DOAJ
description The goals of object detection are to accurately detect and locate objects of various sizes in digital images. Multi-scale processing technology can improve the detection accuracy of the detector. Feature pyramid networks (FPNs) have been proven to be effective in extracting multi-scaled features. However, most existing object detection methods recognize objects in isolation, without considering contextual information between objects. Moreover, for the sake of computational efficiency, a significant reduction in the channel dimension may lead to the loss of semantic information. This study explores the use of attention mechanisms to augment the representational power and efficiency of features, ultimately improving the accuracy and efficiency of object detection. The study proposes a novel hierarchical attention feature pyramid network (HA-FPN), which comprises two key components: transformer feature pyramid networks (TFPNs) and channel attention modules (CAMs). In TFPNs, multi-scaled convolutional features are embedded as tokens, and self-attention is applied both within and across scales to capture contextual information between the tokens. CAMs are employed to select information-rich channels, alleviating the heavy loss of channel information. By introducing contextual information and attention mechanisms, the HA-FPN significantly improves the accuracy of bounding box detection, leading to more precise identification and localization of target objects. Extensive experiments conducted on the challenging MS COCO dataset demonstrate that the proposed HA-FPN outperforms existing multi-object detection models, while incurring minimal computational overhead.
first_indexed 2024-03-11T04:06:43Z
format Article
id doaj.art-470ba527c8f34c6d93c1a64bbfa6a4a3
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-11T04:06:43Z
publishDate 2023-05-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-470ba527c8f34c6d93c1a64bbfa6a4a3
2023-11-17T23:45:22Z
eng
MDPI AG
Sensors
1424-8220
2023-05-01
23(9), 4508
10.3390/s23094508
HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
Jin Dang, Xiaofen Tang, Shuai Li (School of Information Engineering, Ningxia University, Yinchuan 750021, China)
https://www.mdpi.com/1424-8220/23/9/4508
transformer
feature pyramid networks
object detection
attention modules
title HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
title_sort ha fpn hierarchical attention feature pyramid network for object detection
topic transformer
feature pyramid networks
object detection
attention modules
url https://www.mdpi.com/1424-8220/23/9/4508
work_keys_str_mv AT jindang hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection
AT xiaofentang hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection
AT shuaili hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection
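
Illustrative note on the description field: the abstract names two mechanisms, self-attention over tokens drawn from several pyramid scales and channel attention that reweights channels. The sketch below is only a toy illustration of those two ideas in plain Python, not the authors' HA-FPN implementation. Everything in it is an assumption made for illustration: the function names, the identity query/key/value projections, the parameter-free sigmoid gate standing in for a learned channel attention module, and the random feature maps.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tokens_from_pyramid(levels):
    """Flatten every H x W x C level into one shared list of C-dim tokens."""
    return [list(cell) for fmap in levels for row in fmap for cell in row]


def self_attention(tokens):
    """Scaled dot-product self-attention over all tokens.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a trained model would learn these maps.
    """
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        out.append([sum(w * tok[c] for w, tok in zip(weights, tokens))
                    for c in range(dim)])
    return out


def channel_attention(tokens):
    """Rescale each channel by a sigmoid gate of its mean activation.

    Assumption: a parameter-free stand-in for a learned channel
    attention module.
    """
    count, dim = len(tokens), len(tokens[0])
    means = [sum(tok[c] for tok in tokens) / count for c in range(dim)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    gated = [[tok[c] * gates[c] for c in range(dim)] for tok in tokens]
    return gated, gates


def random_level(height, width, channels, rng):
    """Hypothetical helper: a random H x W x C feature map."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(channels)]
             for _ in range(width)] for _ in range(height)]


rng = random.Random(0)
pyramid = [random_level(4, 4, 8, rng), random_level(2, 2, 8, rng)]
tokens = tokens_from_pyramid(pyramid)        # 16 + 4 tokens from two scales
context = self_attention(tokens)             # mixes tokens within and across scales
refined, gates = channel_attention(context)  # per-channel reweighting
print(len(tokens), len(refined[0]))
```

Because all scales share one token list, each attention row mixes tokens from the same level and from other levels, which is the intra- and inter-scale context the abstract refers to. The gate keeps every channel but scales it, so channels with larger mean activation are weighted more heavily.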