Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion

Convolutional neural networks (CNNs) have achieved milestones in object detection of synthetic aperture radar (SAR) images. Recently, vision transformers and their variants have shown great promise in detection tasks. However, ship detection in SAR images remains a substantial challenge because of t...

Bibliographic Details
Main Authors:	Kuoyang Li, Min Zhang, Maiping Xu, Rui Tang, Liang Wang, Hai Wang
Format:	Article
Language:	English
Published:	MDPI AG 2022-07-01
Series:	Remote Sensing
Subjects:	synthetic aperture radar (SAR); ship detection; feature enhancement Swin transformer; adjacent feature fusion; Cascade R-CNN
Online Access:	https://www.mdpi.com/2072-4292/14/13/3186

_version_	1797408243480264704
author	Kuoyang Li; Min Zhang; Maiping Xu; Rui Tang; Liang Wang; Hai Wang
author_facet	Kuoyang Li; Min Zhang; Maiping Xu; Rui Tang; Liang Wang; Hai Wang
author_sort	Kuoyang Li
collection	DOAJ
description	Convolutional neural networks (CNNs) have achieved milestones in object detection in synthetic aperture radar (SAR) images. Recently, vision transformers and their variants have shown great promise in detection tasks. However, ship detection in SAR images remains a substantial challenge because ship objects exhibit strong scattering, span multiple scales, and appear against complex backgrounds. To address these problems, this paper proposes an enhancement Swin transformer detection network, named ESTDNet, for ship detection in SAR images. We adopt Cascade R-CNN with a Swin transformer backbone (Cascade R-CNN Swin) as the baseline model for ESTDNet. On this basis, we build two modules: the feature enhancement Swin transformer (FESwin) module, which improves feature extraction capability, and the adjacent feature fusion (AFF) module, which optimizes the feature pyramid. First, the FESwin module serves as the backbone network and uses CNNs to aggregate contextual information before and after the Swin transformer model. Building on the visual dependencies captured through self-attention, it performs scale fusion with single-point channel information interaction as the primary mechanism and local spatial information interaction as the secondary one, which improves spatial-to-channel feature expression and increases the utilization of ship information in SAR images. Second, the AFF module performs a weighted selective fusion of each high-level feature in the feature pyramid with its adjacent shallow-level features using learnable adaptive weights, so that ship information is concentrated on feature maps at more scales and the recognition and localization of ships improve. Finally, an ablation study on the SSDD dataset validates the effectiveness of the two proposed components. Moreover, experiments on two public datasets, SSDD and SARShip, show that the ESTDNet detector outperforms state-of-the-art methods, providing a new approach to ship detection in SAR images.
first_indexed	2024-03-09T03:55:36Z
format	Article
id	doaj.art-6e5b09b036224986b2cfc7e90d4c6453
institution	Directory Open Access Journal
issn	2072-4292
language	English
last_indexed	2024-03-09T03:55:36Z
publishDate	2022-07-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-6e5b09b036224986b2cfc7e90d4c6453; 2023-12-03T14:21:07Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2022-07-01; vol. 14, no. 13, art. 3186; doi:10.3390/rs14133186; Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion; Kuoyang Li (School of Aerospace Science and Technology, Xidian University, Xi’an 710126, China); Min Zhang (School of Aerospace Science and Technology, Xidian University, Xi’an 710126, China); Maiping Xu (Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi’an 710199, China); Rui Tang (Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi’an 710199, China); Liang Wang (Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi’an 710199, China); Hai Wang (School of Aerospace Science and Technology, Xidian University, Xi’an 710126, China); https://www.mdpi.com/2072-4292/14/13/3186; synthetic aperture radar (SAR); ship detection; feature enhancement Swin transformer; adjacent feature fusion; Cascade R-CNN
title	Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion
title_full	Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion
title_fullStr	Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion
title_full_unstemmed	Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion
title_short	Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion
title_sort	ship detection in sar images based on feature enhancement swin transformer and adjacent feature fusion
topic	synthetic aperture radar (SAR); ship detection; feature enhancement Swin transformer; adjacent feature fusion; Cascade R-CNN
url	https://www.mdpi.com/2072-4292/14/13/3186

Similar Items