Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images
We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in...
Main Authors: | Xiaohu Dong, Yao Qin, Yinghui Gao, Ruigang Fu, Songlin Liu, Yuanxin Ye |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Remote Sensing |
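Subjects: | object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module |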
Online Access: | https://www.mdpi.com/2072-4292/14/15/3735 |
_version_ | 1797440763277082624 |
---|---|
author | Xiaohu Dong; Yao Qin; Yinghui Gao; Ruigang Fu; Songlin Liu; Yuanxin Ye |
author_facet | Xiaohu Dong; Yao Qin; Yinghui Gao; Ruigang Fu; Songlin Liu; Yuanxin Ye |
author_sort | Xiaohu Dong |
collection | DOAJ |
description | We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in remote sensing images usually exhibit variable shapes, orientations, and sizes. To this end, we propose a dedicated object detector based on the FPN architecture to achieve accurate object detection in remote sensing images. Specifically, considering the variable shapes and orientations of remote sensing objects, we first replace the original lateral connections of FPN with Deformable Convolution Lateral Connection Modules (DCLCMs), each of which includes a $3 \times 3$ deformable convolution to generate feature maps with deformable receptive fields. Additionally, we further introduce several Attention-based Multi-Level Feature Fusion Modules (A-MLFFMs) to integrate the multi-level outputs of FPN adaptively, further enabling multi-scale object detection. Extensive experimental results on the DIOR dataset demonstrated the state-of-the-art performance achieved by the proposed method, with the highest mean Average Precision (mAP) of 73.6%. |
first_indexed | 2024-03-09T12:14:03Z |
format | Article |
id | doaj.art-9c0388cc22c54e46a57c945c34ad4043 |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-09T12:14:03Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-9c0388cc22c54e46a57c945c34ad4043; 2023-11-30T22:49:25Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2022-08-01; 14; 15; 3735; 10.3390/rs14153735; Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images; Xiaohu Dong (0); Yao Qin (1); Yinghui Gao (2); Ruigang Fu (3); Songlin Liu (4); Yuanxin Ye (5); (0) College of Electronic Science, National University of Defense Technology, Changsha 410073, China; (1) Remote Sensing Laboratory, Northwest Institute of Nuclear Technology, Xi’an 710024, China; (2) Warfare Studies Institute, Academy of Military Sciences, Beijing 100091, China; (3) College of Electronic Science, National University of Defense Technology, Changsha 410073, China; (4) State Key Laboratory of Geo-Information Engineering, Xi’an 710024, China; (5) Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 610031, China; We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces some challenges when used for remote sensing object detection, as the objects in remote sensing images usually exhibit variable shapes, orientations, and sizes. To this end, we propose a dedicated object detector based on the FPN architecture to achieve accurate object detection in remote sensing images. Specifically, considering the variable shapes and orientations of remote sensing objects, we first replace the original lateral connections of FPN with Deformable Convolution Lateral Connection Modules (DCLCMs), each of which includes a $3 \times 3$ deformable convolution to generate feature maps with deformable receptive fields. Additionally, we further introduce several Attention-based Multi-Level Feature Fusion Modules (A-MLFFMs) to integrate the multi-level outputs of FPN adaptively, further enabling multi-scale object detection. Extensive experimental results on the DIOR dataset demonstrated the state-of-the-art performance achieved by the proposed method, with the highest mean Average Precision (mAP) of 73.6%.; https://www.mdpi.com/2072-4292/14/15/3735; object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module |
spellingShingle | Xiaohu Dong; Yao Qin; Yinghui Gao; Ruigang Fu; Songlin Liu; Yuanxin Ye; Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images; Remote Sensing; object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module |
title | Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images |
title_sort | attention based multi level feature fusion for object detection in remote sensing images |
topic | object detection; remote sensing; deformable convolution; multi-level feature fusion; attention module |
url | https://www.mdpi.com/2072-4292/14/15/3735 |
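
The description names two building blocks: a 3 × 3 deformable convolution in each lateral connection, and an attention-weighted fusion of the multi-level FPN outputs. The following is a minimal pure-Python sketch of both ideas, not the authors' implementation. The function names (`bilinear_sample`, `deform_conv3x3_at`, `attention_fuse`) are hypothetical, the offsets are passed in by hand rather than predicted by a learned layer, and the attention score (a global average per level followed by a softmax) is an assumption, since the abstract does not say how the weights are produced.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2-D grid at a fractional (y, x); points outside the grid read as 0."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1.0 - abs(y - yy)) * (1.0 - abs(x - xx))
                value += weight * feat[yy][xx]
    return value


def deform_conv3x3_at(feat, kernel, offsets, cy, cx):
    """Response of a 3x3 deformable convolution at output location (cy, cx).

    kernel  : 3x3 list of weights
    offsets : 3x3 list of (dy, dx) pairs added to the regular grid positions
              (assumption: supplied by the caller; a network would learn them)
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position (i - 1, j - 1) shifted by its own offset.
            out += kernel[i][j] * bilinear_sample(
                feat, cy + (i - 1) + dy, cx + (j - 1) + dx
            )
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(levels):
    """Fuse same-sized feature maps using one softmax weight per level.

    Assumption: each level's score is its global average value.
    """
    scores = [sum(map(sum, f)) / (len(f) * len(f[0])) for f in levels]
    weights = softmax(scores)
    h, w = len(levels[0]), len(levels[0][0])
    fused = [
        [sum(wt * f[r][c] for wt, f in zip(weights, levels)) for c in range(w)]
        for r in range(h)
    ]
    return fused, weights


# Toy 4x4 feature map whose value at (r, c) is 4r + c.
feat = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]      # averaging kernel
zero = [[(0.0, 0.0)] * 3 for _ in range(3)]    # no deformation
shift = [[(0.0, 0.5)] * 3 for _ in range(3)]   # every tap moved half a pixel right

print(deform_conv3x3_at(feat, box, zero, 1, 1))
print(deform_conv3x3_at(feat, box, shift, 1, 1))
print(attention_fuse([feat, feat])[1])
```

With all offsets set to zero the deformable convolution reduces to an ordinary 3 × 3 convolution, which is a convenient sanity check. In practice the feature maps would first be resized to a common resolution before fusion; the sketch assumes that has already been done.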