Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images

We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in...

Bibliographic Details
Main Authors:	Xiaohu Dong, Yao Qin, Yinghui Gao, Ruigang Fu, Songlin Liu, Yuanxin Ye
Format:	Article
Language:	English
Published:	MDPI AG 2022-08-01
Series:	Remote Sensing
Subjects:	object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module
Online Access:	https://www.mdpi.com/2072-4292/14/15/3735

_version_	1797440763277082624
author	Xiaohu Dong, Yao Qin, Yinghui Gao, Ruigang Fu, Songlin Liu, Yuanxin Ye
author_facet	Xiaohu Dong, Yao Qin, Yinghui Gao, Ruigang Fu, Songlin Liu, Yuanxin Ye
author_sort	Xiaohu Dong
collection	DOAJ
description	We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in remote sensing images usually exhibit variable shapes, orientations, and sizes. To this end, we propose a dedicated object detector based on the FPN architecture to achieve accurate object detection in remote sensing images. Specifically, considering the variable shapes and orientations of remote sensing objects, we first replace the original lateral connections of FPN with Deformable Convolution Lateral Connection Modules (DCLCMs), each of which includes a 3 × 3 deformable convolution to generate feature maps with deformable receptive fields. Additionally, we further introduce several Attention-based Multi-Level Feature Fusion Modules (A-MLFFMs) to integrate the multi-level outputs of FPN adaptively, further enabling multi-scale object detection. Extensive experimental results on the DIOR dataset demonstrated the state-of-the-art performance achieved by the proposed method, with the highest mean Average Precision (mAP) of 73.6%.
first_indexed	2024-03-09T12:14:03Z
format	Article
id	doaj.art-9c0388cc22c54e46a57c945c34ad4043
institution	Directory of Open Access Journals
issn	2072-4292
language	English
last_indexed	2024-03-09T12:14:03Z
publishDate	2022-08-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-9c0388cc22c54e46a57c945c34ad4043; 2023-11-30T22:49:25Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2022-08-01; 14(15), 3735; 10.3390/rs14153735; Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images; Xiaohu Dong (0), Yao Qin (1), Yinghui Gao (2), Ruigang Fu (3), Songlin Liu (4), Yuanxin Ye (5); (0) College of Electronic Science, National University of Defense Technology, Changsha 410073, China; (1) Remote Sensing Laboratory, Northwest Institute of Nuclear Technology, Xi’an 710024, China; (2) Warfare Studies Institute, Academy of Military Sciences, Beijing 100091, China; (3) College of Electronic Science, National University of Defense Technology, Changsha 410073, China; (4) State Key Laboratory of Geo-Information Engineering, Xi’an 710024, China; (5) Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 610031, China; https://www.mdpi.com/2072-4292/14/15/3735
title	Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images
topic	object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module
url	https://www.mdpi.com/2072-4292/14/15/3735

Similar Items