Improving multispectral pedestrian detection with scale‐aware permutation attention and adjacent feature aggregation

Bibliographic Details
Main Authors: Xin Zuo, Zhi Wang, Jifeng Shen, Wankou Yang
Format: Article
Language: English
Published: Wiley 2023-10-01
Series: IET Computer Vision
Online Access: https://doi.org/10.1049/cvi2.12159
Description
Summary: A high‐quality feature fusion module is one of the key components of a multispectral pedestrian detection system in challenging situations, such as large scale variance and occlusion. Although the attention mechanism is one of the most effective ways to refine features, the correlation between attention and scales in the feature pyramid remains unknown. Therefore, a scale‐aware permutated attention module is proposed to adaptively enhance features of objects at different scales in the feature pyramid. Specifically, four different local and global attention sub‐modules are investigated to refine feature maps with different permutations in the Feature Pyramid Network, improving the quality of the feature fusion. In addition, to address the high miss rate for small‐sized pedestrians, an adjacent‐branch feature aggregation module is proposed to aggregate features across different scales, taking both semantic context and spatial resolution into consideration. When equipped with the dual‐branch CenterNet detection framework, the two modules benefit from each other and yield significant improvements in both efficiency and accuracy. Experiments on the KAIST and FLIR datasets demonstrate superior performance compared with other state‐of‐the‐art methods.
ISSN: 1751-9632, 1751-9640