High-Resolution Network with Transformer Embedding Parallel Detection for Small Object Detection in Optical Remote Sensing Images
Small object detection in remote sensing enables the identification and analysis of inconspicuous but important information, playing a crucial role in various ground monitoring tasks. Because small objects occupy so few pixels, the feature information they contain is very limited, making them more e...
Main Authors: | Xiaowen Zhang, Qiaoyuan Liu, Hongliang Chang, Haijiang Sun |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-09-01 |
Series: | Remote Sensing |
Subjects: | remote sensing; object detection; feature extraction; high-resolution; Swin transformer |
Online Access: | https://www.mdpi.com/2072-4292/15/18/4497 |
_version_ | 1827723823458287616 |
---|---|
author | Xiaowen Zhang; Qiaoyuan Liu; Hongliang Chang; Haijiang Sun |
author_sort | Xiaowen Zhang |
collection | DOAJ |
description | Small object detection in remote sensing enables the identification and analysis of inconspicuous but important information, playing a crucial role in various ground monitoring tasks. Because small objects occupy so few pixels, the feature information they contain is very limited, making them more easily buried by the complex background. Although small object detection is a research hotspot in remote sensing and many breakthroughs have been made, existing approaches still have two significant shortcomings: first, the down-sampling operations commonly used for feature extraction can barely preserve the weak features of tiny objects; second, convolutional neural network methods are limited in modeling the global context needed to handle cluttered backgrounds. To tackle these issues, this paper proposes a high-resolution network with transformer embedding parallel detection (HRTP-Net). A high-resolution feature fusion network (HR-FFN) is designed to address the first problem by maintaining high-spatial-resolution features with enhanced semantic information. Furthermore, a Swin-transformer-based mixed attention module (STMA) is proposed to augment the object information in the transformer block by establishing pixel-level correlations, thereby enabling global background–object modeling and addressing the second shortcoming. Finally, a parallel detection structure for remote sensing is constructed by integrating the attentional outputs of STMA with standard convolutional features. The proposed method effectively mitigates the impact of intricate backgrounds on small objects. Comprehensive experimental results on three representative remote sensing datasets containing small objects (MASATI, VEDAI, and DOTA) demonstrate that the proposed HRTP-Net achieves promising and competitive performance. |
first_indexed | 2024-03-10T22:05:15Z |
format | Article |
id | doaj.art-3c1db7cb4ceb4440bde26f261bbec259 |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-10T22:05:15Z |
publishDate | 2023-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-3c1db7cb4ceb4440bde26f261bbec259; 2023-11-19T12:48:32Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2023-09-01; vol. 15, no. 18, art. 4497; DOI 10.3390/rs15184497; High-Resolution Network with Transformer Embedding Parallel Detection for Small Object Detection in Optical Remote Sensing Images; Xiaowen Zhang, Qiaoyuan Liu, Hongliang Chang, Haijiang Sun (all: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China); https://www.mdpi.com/2072-4292/15/18/4497; keywords: remote sensing, object detection, feature extraction, high-resolution, Swin transformer |
title | High-Resolution Network with Transformer Embedding Parallel Detection for Small Object Detection in Optical Remote Sensing Images |
topic | remote sensing; object detection; feature extraction; high-resolution; Swin transformer |
url | https://www.mdpi.com/2072-4292/15/18/4497 |