High-Resolution Network with Transformer Embedding Parallel Detection for Small Object Detection in Optical Remote Sensing Images

Bibliographic Details
Main Authors: Xiaowen Zhang, Qiaoyuan Liu, Hongliang Chang, Haijiang Sun
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Remote Sensing
Subjects: remote sensing, object detection, feature extraction, high-resolution, Swin transformer
Online Access: https://www.mdpi.com/2072-4292/15/18/4497
author Xiaowen Zhang
Qiaoyuan Liu
Hongliang Chang
Haijiang Sun
author_sort Xiaowen Zhang
collection DOAJ
description Small object detection in remote sensing enables the identification and analysis of unapparent but important information, playing a crucial role in various ground monitoring tasks. Due to the small size, the available feature information contained in small objects is very limited, making them more easily buried by the complex background. As one of the research hotspots in remote sensing, although many breakthroughs have been made, there still exist two significant shortcomings for the existing approaches: first, the down-sampling operation commonly used for feature extraction can barely preserve weak features of objects in a tiny size; second, the convolutional neural network methods have limitations in modeling global context to address cluttered backgrounds. To tackle these issues, a high-resolution network with transformer embedding parallel detection (HRTP-Net) is proposed in this paper. A high-resolution feature fusion network (HR-FFN) is designed to solve the first problem by maintaining high spatial resolution features with enhanced semantic information. Furthermore, a Swin-transformer-based mixed attention module (STMA) is proposed to augment the object information in the transformer block by establishing a pixel-level correlation, thereby enabling global background–object modeling, which can address the second shortcoming. Finally, a parallel detection structure for remote sensing is constructed by integrating the attentional outputs of STMA with standard convolutional features. The proposed method effectively mitigates the impact of the intricate background on small objects. The comprehensive experiment results on three representative remote sensing datasets with small objects (MASATI, VEDAI and DOTA datasets) demonstrate that the proposed HRTP-Net achieves a promising and competitive performance.
first_indexed 2024-03-10T22:05:15Z
format Article
id doaj.art-3c1db7cb4ceb4440bde26f261bbec259
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T22:05:15Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-3c1db7cb4ceb4440bde26f261bbec259
2023-11-19T12:48:32Z
eng
MDPI AG
Remote Sensing
2072-4292
2023-09-01
15, 18, 4497
10.3390/rs15184497
High-Resolution Network with Transformer Embedding Parallel Detection for Small Object Detection in Optical Remote Sensing Images
Xiaowen Zhang, Qiaoyuan Liu, Hongliang Chang, Haijiang Sun (all: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China)
https://www.mdpi.com/2072-4292/15/18/4497
remote sensing; object detection; feature extraction; high-resolution; Swin transformer
title High-Resolution Network with Transformer Embedding Parallel Detection for Small Object Detection in Optical Remote Sensing Images
topic remote sensing
object detection
feature extraction
high-resolution
Swin transformer
url https://www.mdpi.com/2072-4292/15/18/4497