HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection
The goals of object detection are to accurately detect and locate objects of various sizes in digital images. Multi-scale processing technology can improve the detection accuracy of the detector. Feature pyramid networks (FPNs) have been proven to be effective in extracting multi-scaled features. Ho...
Main Authors: | Jin Dang, Xiaofen Tang, Shuai Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Sensors |
Subjects: | transformer; feature pyramid networks; object detection; attention modules |
Online Access: | https://www.mdpi.com/1424-8220/23/9/4508 |
_version_ | 1797601656901206016 |
---|---|
author | Jin Dang Xiaofen Tang Shuai Li |
author_sort | Jin Dang |
collection | DOAJ |
description | The goal of object detection is to accurately detect and locate objects of various sizes in digital images. Multi-scale processing can improve a detector's accuracy, and feature pyramid networks (FPNs) have proven effective at extracting multi-scale features. However, most existing object detection methods recognize objects in isolation, without considering contextual information between objects. Moreover, for the sake of computational efficiency, a significant reduction in the channel dimension may lead to the loss of semantic information. This study explores the use of attention mechanisms to strengthen the representational power and efficiency of features, ultimately improving the accuracy and efficiency of object detection. The study proposes a novel hierarchical attention feature pyramid network (HA-FPN), which comprises two key components: transformer feature pyramid networks (TFPNs) and channel attention modules (CAMs). In TFPNs, multi-scale convolutional features are embedded as tokens, and self-attention is applied both within and across scales to capture contextual information between the tokens. CAMs are employed to select information-rich channels, alleviating the loss of channel information. By introducing contextual information and attention mechanisms, the HA-FPN significantly improves the accuracy of bounding box detection, leading to more precise identification and localization of target objects. Extensive experiments conducted on the challenging MS COCO dataset demonstrate that the proposed HA-FPN outperforms existing multi-object detection models while incurring minimal computational overhead. |
first_indexed | 2024-03-11T04:06:43Z |
format | Article |
id | doaj.art-470ba527c8f34c6d93c1a64bbfa6a4a3 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-11T04:06:43Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-470ba527c8f34c6d93c1a64bbfa6a4a3 2023-11-17T23:45:22Z eng MDPI AG Sensors 1424-8220 2023-05-01 23 9 4508 10.3390/s23094508 HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection Jin Dang (School of Information Engineering, Ningxia University, Yinchuan 750021, China) Xiaofen Tang (School of Information Engineering, Ningxia University, Yinchuan 750021, China) Shuai Li (School of Information Engineering, Ningxia University, Yinchuan 750021, China) https://www.mdpi.com/1424-8220/23/9/4508 transformer feature pyramid networks object detection attention modules |
spellingShingle | Jin Dang Xiaofen Tang Shuai Li HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection Sensors transformer feature pyramid networks object detection attention modules |
title | HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection |
title_sort | ha fpn hierarchical attention feature pyramid network for object detection |
topic | transformer feature pyramid networks object detection attention modules |
url | https://www.mdpi.com/1424-8220/23/9/4508 |
work_keys_str_mv | AT jindang hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection AT xiaofentang hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection AT shuaili hafpnhierarchicalattentionfeaturepyramidnetworkforobjectdetection |
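
The description above names two mechanisms: embedding multi-scale features as tokens with self-attention applied within and across scales, and channel attention that reweights channels. The NumPy sketch below illustrates both ideas in their simplest form. It is not the authors' implementation: the function names, the single-head attention, the squeeze-and-excitation style gate, the random weights, and the toy pyramid sizes are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def to_tokens(feature_map):
    """Flatten a (C, H, W) feature map into (H*W, C) tokens (hypothetical helper)."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T


def cross_scale_self_attention(feature_maps, w_q, w_k, w_v):
    """Single-head self-attention over tokens gathered from every pyramid level.

    Concatenating tokens from all levels lets each token attend to tokens of
    its own scale (intra-scale) and of the other scales (inter-scale).
    """
    tokens = np.concatenate([to_tokens(f) for f in feature_maps], axis=0)  # (N, C)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (N, N) scaled dot products
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style channel gate (an assumed stand-in for a CAM)."""
    squeezed = feature_map.mean(axis=(1, 2))       # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid -> per-channel weight
    return feature_map * gate[:, None, None], gate


# Toy three-level pyramid with C = 8 channels (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
C = 8
pyramid = [
    rng.standard_normal((C, 8, 8)),
    rng.standard_normal((C, 4, 4)),
    rng.standard_normal((C, 2, 2)),
]
w_q, w_k, w_v = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))
context, attn = cross_scale_self_attention(pyramid, w_q, w_k, w_v)

w1 = rng.standard_normal((C // 2, C)) * 0.1
w2 = rng.standard_normal((C, C // 2)) * 0.1
reweighted, gate = channel_attention(pyramid[0], w1, w2)

print(context.shape, attn.shape, reweighted.shape)
```

With these toy sizes the three levels contribute 64 + 16 + 4 = 84 tokens, so the attention matrix is 84 by 84. A practical model would use learned weights, multiple heads, and some way to limit the quadratic cost on large feature maps; none of that is specified in this record.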