An Efficient and Light Transformer-Based Segmentation Network for Remote Sensing Images of Landscapes
High-resolution image segmentation for landscape applications has garnered significant attention, particularly in the context of ultra-high-resolution (UHR) imagery. Current segmentation methodologies partition UHR images into standard patches for multiscale local segmentation and hierarchical reaso...
Main Authors: | Lijia Chen, Honghui Chen, Yanqiu Xie, Tianyou He, Jing Ye, Yushan Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Forests |
Subjects: | ultra-high-resolution image; segmentation quality; multilevel semantic contexts; transformer |
Online Access: | https://www.mdpi.com/1999-4907/14/11/2271 |
_version_ | 1797459277709836288 |
---|---|
author | Lijia Chen; Honghui Chen; Yanqiu Xie; Tianyou He; Jing Ye; Yushan Zheng |
author_facet | Lijia Chen; Honghui Chen; Yanqiu Xie; Tianyou He; Jing Ye; Yushan Zheng |
author_sort | Lijia Chen |
collection | DOAJ |
description | High-resolution image segmentation for landscape applications has garnered significant attention, particularly in the context of ultra-high-resolution (UHR) imagery. Current segmentation methods partition UHR images into standard patches for multiscale local segmentation and hierarchical reasoning. This creates a dilemma: a trade-off between memory efficiency and segmentation quality. This paper introduces the Multilevel Contexts Weighted Coupling Transformer (WCTNet) for UHR segmentation. The framework comprises the Multi-level Feature Weighting (MFW) module and the Token-based Transformer (TT), designed to weight and couple multilevel semantic contexts. First, we analyze the multilevel semantics within a local patch without image-level contextual reasoning, which avoids complex image-level contextual associations and the misleading information they can carry. Second, MFW weights shallow and deep features to enhance object-related attention at different granularities of the multilevel semantics. Third, the TT module couples the multilevel semantic contexts and transforms them into semantic tokens using spatial attention, so that token interactions can be captured and clearer local representations obtained. This contextual weighting and coupling of single-scale patches enables WCTNet to balance accuracy and computational overhead. Experimental results show that WCTNet achieves state-of-the-art performance on two UHR datasets, DeepGlobe and Inria Aerial. |
first_indexed | 2024-03-09T16:49:09Z |
format | Article |
id | doaj.art-a45965c99e814d129d2fe1cca1a6b174 |
institution | Directory Open Access Journal |
issn | 1999-4907 |
language | English |
last_indexed | 2024-03-09T16:49:09Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Forests |
spelling | doaj.art-a45965c99e814d129d2fe1cca1a6b174; 2023-11-24T14:42:54Z; eng; MDPI AG; Forests; 1999-4907; 2023-11-01; 14; 11; 2271; 10.3390/f14112271; An Efficient and Light Transformer-Based Segmentation Network for Remote Sensing Images of Landscapes; Lijia Chen (0), Honghui Chen (1), Yanqiu Xie (2), Tianyou He (3), Jing Ye (4), Yushan Zheng (5); (0), (2)-(5) College of Landscape Architecture, Fujian Agriculture and Forest University, Fuzhou 350002, China; (1) Department of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; https://www.mdpi.com/1999-4907/14/11/2271; ultra-high-resolution image; segmentation quality; multilevel semantic contexts; transformer |
title | An Efficient and Light Transformer-Based Segmentation Network for Remote Sensing Images of Landscapes |
title_sort | efficient and light transformer based segmentation network for remote sensing images of landscapes |
topic | ultra-high-resolution image; segmentation quality; multilevel semantic contexts; transformer |
url | https://www.mdpi.com/1999-4907/14/11/2271 |
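The description above says that the TT module turns patch features into semantic tokens using spatial attention and then captures token interactions. The record gives no implementation details, so the following is only a minimal, dependency-free sketch of that general idea and not the authors' WCTNet code. The function names `tokenize` and `token_self_attention`, the `token_weights` parameter, and the use of identity query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def tokenize(features, token_weights):
    """Pool N pixel features into L tokens with spatial attention.

    features:      list of N feature vectors (one per spatial position).
    token_weights: list of L vectors (hypothetical learned parameters);
                   each one scores every position for one token.
    Returns L tokens, each an attention-weighted average of the features.
    """
    dim = len(features[0])
    tokens = []
    for w in token_weights:
        # Attention over spatial positions for this token.
        attn = softmax([dot(w, f) for f in features])
        tokens.append(
            [sum(a * f[c] for a, f in zip(attn, features)) for c in range(dim)]
        )
    return tokens


def token_self_attention(tokens):
    """Single-head scaled dot-product self-attention among tokens.

    Assumption: identity Q/K/V projections, to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    mixed = []
    for q in tokens:
        attn = softmax([dot(q, k) / scale for k in tokens])
        mixed.append(
            [sum(a * v[c] for a, v in zip(attn, tokens)) for c in range(dim)]
        )
    return mixed


if __name__ == "__main__":
    # Four spatial positions with two-channel features (toy values).
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    weights = [[1.0, 0.0], [0.0, 1.0]]  # two hypothetical tokens
    toks = tokenize(feats, weights)
    print(token_self_attention(toks))
```

Because the number of tokens is much smaller than the number of spatial positions, attention among tokens is cheap compared with attention among all pixels, which is one plausible reading of how a token-based design keeps computational overhead low.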