DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery

Bibliographic Details
Main Authors: Wenxu Shi, Qingyan Meng, Linlin Zhang, Maofan Zhao, Chen Su, Tamás Jancsó
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Remote Sensing
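Subjects: convolutional neural network (CNN); deep supervision; lightweight model; remote sensing; semantic segmentation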
Online Access: https://www.mdpi.com/2072-4292/14/21/5399
author Wenxu Shi
Qingyan Meng
Linlin Zhang
Maofan Zhao
Chen Su
Tamás Jancsó
collection DOAJ
description Semantic segmentation for remote sensing images (RSIs) plays an important role in many applications, such as urban planning, environmental protection, agricultural valuation, and military reconnaissance. With the boom in remote sensing technology, numerous RSIs are generated; this is difficult for current complex networks to handle. Efficient networks are the key to solving this challenge. Many previous works aimed at designing lightweight networks or utilizing pruning and knowledge distillation methods to obtain efficient networks, but these methods inevitably reduce the ability of the resulting models to characterize spatial and semantic features. We propose an effective deep supervision-based simple attention network (DSANet) with spatial and semantic enhancement losses to handle these problems. In the network, (1) a lightweight architecture is used as the backbone; (2) deep supervision modules with improved multiscale spatial detail (MSD) and hierarchical semantic enhancement (HSE) losses synergistically strengthen the obtained feature representations; and (3) a simple embedding attention module (EAM) with linear complexity performs long-range relationship modeling. Experiments conducted on two public RSI datasets (the ISPRS Potsdam dataset and Vaihingen dataset) exhibit the substantial advantages of the proposed approach. Our method achieves 79.19% mean intersection over union (mIoU) on the ISPRS Potsdam test set and 72.26% mIoU on the Vaihingen test set with speeds of 470.07 FPS on 512 × 512 images and 5.46 FPS on 6000 × 6000 images using an RTX 3090 GPU.
first_indexed 2024-03-09T18:42:26Z
format Article
id doaj.art-ce763e4f87254a9db46109d42d70f540
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-09T18:42:26Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-ce763e4f87254a9db46109d42d70f540
2023-11-24T06:38:24Z
eng
MDPI AG
Remote Sensing
2072-4292
2022-10-01
14(21), 5399
10.3390/rs14215399
DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery
Wenxu Shi (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China)
Qingyan Meng (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China)
Linlin Zhang (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China)
Maofan Zhao (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China)
Chen Su (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China)
Tamás Jancsó (Alba Regia Technical Faculty, Obuda University, Budai ut 45, 8001 Szekesfehervar, Hungary)
https://www.mdpi.com/2072-4292/14/21/5399
convolutional neural network (CNN)
deep supervision
lightweight model
remote sensing
semantic segmentation
title DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery
topic convolutional neural network (CNN)
deep supervision
lightweight model
remote sensing
semantic segmentation
url https://www.mdpi.com/2072-4292/14/21/5399
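
The description above reports accuracy as mean intersection over union (mIoU). For readers unfamiliar with the metric, the following is a minimal sketch of the common definition: per-class IoU is TP / (TP + FP + FN), averaged over the classes that occur. The function names and the toy labels are illustrative assumptions; the record does not state the paper's exact evaluation protocol (for example, which classes or boundary pixels are ignored), so this is not presented as the authors' code.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels by (true class, predicted class). Labels are ints in [0, num_classes)."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN), skipping classes with an empty union."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted as something else
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class is something else
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example (hypothetical labels, flattened to 1-D pixel lists).
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
score = mean_iou(cm)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Real evaluations accumulate one confusion matrix over every pixel of the test set before averaging.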