Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images

We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in...

Bibliographic Details
Main Authors: Xiaohu Dong, Yao Qin, Yinghui Gao, Ruigang Fu, Songlin Liu, Yuanxin Ye
Format: Article
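Language: English
Published: MDPI AG 2022-08-01
Series: Remote Sensing
Subjects: object detection, remote sensing, deformable convolution, multi-level feature fusion, attention module
Online Access: https://www.mdpi.com/2072-4292/14/15/3735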
author Xiaohu Dong
Yao Qin
Yinghui Gao
Ruigang Fu
Songlin Liu
Yuanxin Ye
collection DOAJ
description We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in remote sensing images usually exhibit variable shapes, orientations, and sizes. To this end, we propose a dedicated object detector based on the FPN architecture to achieve accurate object detection in remote sensing images. Specifically, considering the variable shapes and orientations of remote sensing objects, we first replace the original lateral connections of FPN with Deformable Convolution Lateral Connection Modules (DCLCMs), each of which includes a 3 × 3 deformable convolution to generate feature maps with deformable receptive fields. Additionally, we further introduce several Attention-based Multi-Level Feature Fusion Modules (A-MLFFMs) to integrate the multi-level outputs of FPN adaptively, further enabling multi-scale object detection. Extensive experimental results on the DIOR dataset demonstrated the state-of-the-art performance achieved by the proposed method, with the highest mean Average Precision (mAP) of 73.6%.
first_indexed 2024-03-09T12:14:03Z
format Article
id doaj.art-9c0388cc22c54e46a57c945c34ad4043
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-09T12:14:03Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-9c0388cc22c54e46a57c945c34ad4043
2023-11-30T22:49:25Z
eng
MDPI AG
Remote Sensing
2072-4292
2022-08-01
Volume 14, Issue 15, Article 3735
DOI: 10.3390/rs14153735
Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images
Xiaohu Dong: College of Electronic Science, National University of Defense Technology, Changsha 410073, China
Yao Qin: Remote Sensing Laboratory, Northwest Institute of Nuclear Technology, Xi’an 710024, China
Yinghui Gao: Warfare Studies Institute, Academy of Military Sciences, Beijing 100091, China
Ruigang Fu: College of Electronic Science, National University of Defense Technology, Changsha 410073, China
Songlin Liu: State Key Laboratory of Geo-Information Engineering, Xi’an 710024, China
Yuanxin Ye: Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 610031, China
https://www.mdpi.com/2072-4292/14/15/3735
object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module
title Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images
topic object detection
remote sensing
deformable convolution
multi-level feature fusion
attention module
url https://www.mdpi.com/2072-4292/14/15/3735
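
Illustrative sketch. The description above names two ideas: a 3 × 3 deformable convolution in each lateral connection, and attention-weighted fusion of multi-level FPN outputs. The pure-Python sketch below shows the generic form of both under simplifying assumptions; it is not the authors' DCLCM or A-MLFFM, whose internal design this record does not give. All function names are hypothetical, the per-tap offsets and per-level scores are passed in by hand (in a trained detector they would be predicted by learned layers), and the feature maps are assumed to be single-channel and already resized to a common shape.

```python
import math


def bilinear(fmap, y, x):
    """Sample a 2D map at fractional (y, x); positions outside the map count as 0."""
    h, w = len(fmap), len(fmap[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                total += wy * wx * fmap[yy][xx]
    return total


def deformable_conv3x3_at(fmap, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[k] = (dy, dx) shifts the k-th of the nine regular taps
    (row-major order), so the receptive field can bend away from the grid.
    Offsets are an input here (assumption); normally they are learned.
    """
    out = 0.0
    k = 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            dy, dx = offsets[k]
            out += kernel[i + 1][j + 1] * bilinear(fmap, cy + i + dy, cx + j + dx)
            k += 1
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_fuse(levels, scores):
    """Weighted sum of same-sized feature maps, with softmax weights over levels.

    scores holds one attention logit per level (assumption: supplied by the
    caller rather than produced by an attention module).
    """
    weights = softmax(scores)
    h, w = len(levels[0]), len(levels[0][0])
    return [
        [sum(wt * lvl[r][c] for wt, lvl in zip(weights, levels)) for c in range(w)]
        for r in range(h)
    ]
```

With all offsets set to (0, 0) the first function reduces to an ordinary 3 × 3 convolution, and with equal scores the fusion reduces to a plain average of the levels; the adaptive behaviour comes entirely from how the offsets and scores are predicted.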