An Efficient and Light Transformer-Based Segmentation Network for Remote Sensing Images of Landscapes

High-resolution image segmentation for landscape applications has garnered significant attention, particularly in the context of ultra-high-resolution (UHR) imagery. Current segmentation methodologies partition UHR images into standard patches for multiscale local segmentation and hierarchical reaso...

Bibliographic Details
Main Authors:	Lijia Chen, Honghui Chen, Yanqiu Xie, Tianyou He, Jing Ye, Yushan Zheng
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	Forests
Subjects:	ultra-high-resolution image; segmentation quality; multilevel semantic contexts; transformer
Online Access:	https://www.mdpi.com/1999-4907/14/11/2271

collection	DOAJ
description	High-resolution image segmentation for landscape applications has garnered significant attention, particularly in the context of ultra-high-resolution (UHR) imagery. Current segmentation methods partition UHR images into standard patches for multiscale local segmentation and hierarchical reasoning. This creates a dilemma: the trade-off between memory efficiency and segmentation quality becomes increasingly evident. This paper introduces the Multilevel Contexts Weighted Coupling Transformer (WCTNet) for UHR segmentation. The framework comprises the Multi-level Feature Weighting (MFW) module and the Token-based Transformer (TT), which are designed to weight and couple multilevel semantic contexts. First, we analyze the multilevel semantics within a local patch without image-level contextual reasoning. This avoids complex image-level contextual associations and eliminates the misleading information they carry. Second, MFW is developed to weight shallow and deep features, enhancing object-related attention at different granularities of the multilevel semantics. Third, the TT module is introduced to couple multilevel semantic contexts and transform them into semantic tokens using spatial attention. We can then capture token interactions and obtain clearer local representations. The proposed contextual weighting and coupling of single-scale patches enables WCTNet to maintain a good balance between accuracy and computational overhead. Experimental results show that WCTNet achieves state-of-the-art performance on two UHR datasets, DeepGlobe and Inria Aerial.
id	doaj.art-a45965c99e814d129d2fe1cca1a6b174
institution	Directory Open Access Journal
issn	1999-4907
DOI:	10.3390/f14112271
Volume/Issue:	14(11), article 2271
Affiliations:	Lijia Chen: College of Landscape Architecture, Fujian Agriculture and Forest University, Fuzhou 350002, China
	Honghui Chen: Department of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
	Yanqiu Xie: College of Landscape Architecture, Fujian Agriculture and Forest University, Fuzhou 350002, China
	Tianyou He: College of Landscape Architecture, Fujian Agriculture and Forest University, Fuzhou 350002, China
	Jing Ye: College of Landscape Architecture, Fujian Agriculture and Forest University, Fuzhou 350002, China
	Yushan Zheng: College of Landscape Architecture, Fujian Agriculture and Forest University, Fuzhou 350002, China

Similar Items