DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery
Semantic segmentation for remote sensing images (RSIs) plays an important role in many applications, such as urban planning, environmental protection, agricultural valuation, and military reconnaissance. With the boom in remote sensing technology, numerous RSIs are generated; this is difficult for c...
Main Authors: | Wenxu Shi, Qingyan Meng, Linlin Zhang, Maofan Zhao, Chen Su, Tamás Jancsó |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-10-01 |
Series: | Remote Sensing |
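Subjects: | convolutional neural network (CNN); deep supervision; lightweight model; remote sensing; semantic segmentation |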
Online Access: | https://www.mdpi.com/2072-4292/14/21/5399 |
_version_ | 1797466632465940480 |
---|---|
author | Wenxu Shi; Qingyan Meng; Linlin Zhang; Maofan Zhao; Chen Su; Tamás Jancsó |
author_facet | Wenxu Shi; Qingyan Meng; Linlin Zhang; Maofan Zhao; Chen Su; Tamás Jancsó |
author_sort | Wenxu Shi |
collection | DOAJ |
description | Semantic segmentation for remote sensing images (RSIs) plays an important role in many applications, such as urban planning, environmental protection, agricultural valuation, and military reconnaissance. With the boom in remote sensing technology, numerous RSIs are generated; this is difficult for current complex networks to handle. Efficient networks are the key to solving this challenge. Many previous works aimed at designing lightweight networks or utilizing pruning and knowledge distillation methods to obtain efficient networks, but these methods inevitably reduce the ability of the resulting models to characterize spatial and semantic features. We propose an effective deep supervision-based simple attention network (DSANet) with spatial and semantic enhancement losses to handle these problems. In the network, (1) a lightweight architecture is used as the backbone; (2) deep supervision modules with improved multiscale spatial detail (MSD) and hierarchical semantic enhancement (HSE) losses synergistically strengthen the obtained feature representations; and (3) a simple embedding attention module (EAM) with linear complexity performs long-range relationship modeling. Experiments conducted on two public RSI datasets (the ISPRS Potsdam dataset and Vaihingen dataset) exhibit the substantial advantages of the proposed approach. Our method achieves 79.19% mean intersection over union (mIoU) on the ISPRS Potsdam test set and 72.26% mIoU on the Vaihingen test set with speeds of 470.07 FPS on 512 × 512 images and 5.46 FPS on 6000 × 6000 images using an RTX 3090 GPU. |
first_indexed | 2024-03-09T18:42:26Z |
format | Article |
id | doaj.art-ce763e4f87254a9db46109d42d70f540 |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-09T18:42:26Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-ce763e4f87254a9db46109d42d70f540; 2023-11-24T06:38:24Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2022-10-01; vol. 14, no. 21, art. 5399; doi:10.3390/rs14215399; DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery; Wenxu Shi, Qingyan Meng, Linlin Zhang, Maofan Zhao, Chen Su (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China); Tamás Jancsó (Alba Regia Technical Faculty, Obuda University, Budai ut 45, 8001 Szekesfehervar, Hungary) |
title | DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery |
title_sort | dsanet a deep supervision based simple attention network for efficient semantic segmentation in remote sensing imagery |
topic | convolutional neural network (CNN); deep supervision; lightweight model; remote sensing; semantic segmentation |
url | https://www.mdpi.com/2072-4292/14/21/5399 |