DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery

Semantic segmentation for remote sensing images (RSIs) plays an important role in many applications, such as urban planning, environmental protection, agricultural valuation, and military reconnaissance. With the boom in remote sensing technology, numerous RSIs are generated; this is difficult for c...

Bibliographic Details
Main Authors:	Wenxu Shi, Qingyan Meng, Linlin Zhang, Maofan Zhao, Chen Su, Tamás Jancsó
Format:	Article
Language:	English
Published:	MDPI AG 2022-10-01
Series:	Remote Sensing
Subjects:	convolutional neural network (CNN); deep supervision; lightweight model; remote sensing; semantic segmentation
Online Access:	https://www.mdpi.com/2072-4292/14/21/5399

_version_	1797466632465940480
author	Wenxu Shi; Qingyan Meng; Linlin Zhang; Maofan Zhao; Chen Su; Tamás Jancsó
author_sort	Wenxu Shi
collection	DOAJ
description	Semantic segmentation for remote sensing images (RSIs) plays an important role in many applications, such as urban planning, environmental protection, agricultural valuation, and military reconnaissance. With the boom in remote sensing technology, numerous RSIs are generated; this is difficult for current complex networks to handle. Efficient networks are the key to solving this challenge. Many previous works aimed at designing lightweight networks or utilizing pruning and knowledge distillation methods to obtain efficient networks, but these methods inevitably reduce the ability of the resulting models to characterize spatial and semantic features. We propose an effective deep supervision-based simple attention network (DSANet) with spatial and semantic enhancement losses to handle these problems. In the network, (1) a lightweight architecture is used as the backbone; (2) deep supervision modules with improved multiscale spatial detail (MSD) and hierarchical semantic enhancement (HSE) losses synergistically strengthen the obtained feature representations; and (3) a simple embedding attention module (EAM) with linear complexity performs long-range relationship modeling. Experiments conducted on two public RSI datasets (the ISPRS Potsdam dataset and Vaihingen dataset) exhibit the substantial advantages of the proposed approach. Our method achieves 79.19% mean intersection over union (mIoU) on the ISPRS Potsdam test set and 72.26% mIoU on the Vaihingen test set with speeds of 470.07 FPS on 512 × 512 images and 5.46 FPS on 6000 × 6000 images using an RTX 3090 GPU.
first_indexed	2024-03-09T18:42:26Z
format	Article
id	doaj.art-ce763e4f87254a9db46109d42d70f540
institution	Directory of Open Access Journals
issn	2072-4292
language	English
last_indexed	2024-03-09T18:42:26Z
publishDate	2022-10-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-ce763e4f87254a9db46109d42d70f540; 2023-11-24T06:38:24Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2022-10-01; volume 14, issue 21, article 5399; DOI 10.3390/rs14215399; DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery; Wenxu Shi, Qingyan Meng, Linlin Zhang, Maofan Zhao, Chen Su (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China); Tamás Jancsó (Alba Regia Technical Faculty, Obuda University, Budai ut 45, 8001 Szekesfehervar, Hungary); https://www.mdpi.com/2072-4292/14/21/5399; convolutional neural network (CNN); deep supervision; lightweight model; remote sensing; semantic segmentation
title	DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery
topic	convolutional neural network (CNN); deep supervision; lightweight model; remote sensing; semantic segmentation
url	https://www.mdpi.com/2072-4292/14/21/5399

Similar Items