A Semantic Segmentation Method for Remote Sensing Images Based on the Swin Transformer Fusion Gabor Filter


Bibliographic Details
Main Authors: Dongdong Feng, Zhihua Zhang, Kun Yan
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access
Subjects: FAM; Gabor filter; remote sensing; semantic segmentation; Swin transformer
Online Access: https://ieeexplore.ieee.org/document/9837069/
_version_ 1828190803801931776
author Dongdong Feng
Zhihua Zhang
Kun Yan
author_facet Dongdong Feng
Zhihua Zhang
Kun Yan
author_sort Dongdong Feng
collection DOAJ
description Semantic segmentation of remote sensing images is increasingly important in urban planning, autonomous driving, disaster monitoring, and land cover classification. With the development of high-resolution remote sensing satellite technology, multilevel, large-scale, and high-precision segmentation has become the focus of current research. High-resolution remote sensing images have high intraclass diversity and low interclass separability, which pose challenges to the precise, detailed representation of multiscale information. In this paper, a semantic segmentation method for remote sensing images based on a Swin Transformer fused with a Gabor filter is proposed. First, a Swin Transformer is used as the backbone network to extract image information at different levels. Then, the texture and edge features of the input image are extracted with a Gabor filter, and the multilevel features are merged by introducing a feature aggregation module (FAM) and an attentional embedding module (AEM). Finally, the segmentation result is optimized with a fully connected conditional random field (FC-CRF). Our proposed method, called Swin-S-GF, achieved mean Intersection over Union (mIoU) scores of 80.14%, 66.50%, and 70.61% on the large-scale classification set, the fine land-cover classification set, and the "AI + Remote Sensing imaging dataset" (AI+RS dataset), respectively. Compared with DeepLabV3, mIoU increased by 0.67%, 3.43%, and 3.80%, respectively. We therefore believe that this model provides a good tool for the semantic segmentation of high-precision remote sensing images.
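The abstract's second step applies a Gabor filter bank to pull texture and edge cues from the input image before fusion with the Swin Transformer features. As an illustration only (not the authors' implementation; kernel size and all filter parameters below are assumptions), a minimal NumPy sketch of an oriented Gabor filter bank:

```python
import numpy as np

def gabor_kernel(ksize=21, sigma=4.0, theta=0.0, lambd=10.0, gamma=0.5, psi=0.0):
    # Real-valued Gabor kernel: a Gaussian envelope modulating an
    # oriented cosine carrier of wavelength `lambd`.
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xt = x * np.cos(theta) + y * np.sin(theta)    # coordinates rotated by theta
    yt = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xt**2 + (gamma * yt)**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * xt / lambd + psi)
    return envelope * carrier

def gabor_bank_responses(image, n_orientations=4):
    # Filter a 2-D grayscale image with Gabor kernels at evenly spaced
    # orientations; circular convolution via FFT keeps the sketch short.
    responses = []
    F = np.fft.fft2(image)
    for k in range(n_orientations):
        kern = gabor_kernel(theta=k * np.pi / n_orientations)
        K = np.fft.fft2(kern, s=image.shape)
        responses.append(np.real(np.fft.ifft2(F * K)))
    return np.stack(responses)    # shape: (n_orientations, H, W)
```

Each orientation responds strongly to edges and texture striations perpendicular to its carrier direction; stacking the responses gives a multichannel texture map that a fusion module could combine with backbone features.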
first_indexed 2024-04-12T08:27:45Z
format Article
id doaj.art-9fb1cfde0459499b8e7348a6589106ed
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-12T08:27:45Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-9fb1cfde0459499b8e7348a6589106ed
Indexed: 2022-12-22T03:40:18Z
eng; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; Vol. 10, pp. 77432-77451; doi:10.1109/ACCESS.2022.3193248; document 9837069
A Semantic Segmentation Method for Remote Sensing Images Based on the Swin Transformer Fusion Gabor Filter
Dongdong Feng (https://orcid.org/0000-0003-0363-7958), Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou, China
Zhihua Zhang, National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou, China
Kun Yan (https://orcid.org/0000-0002-0480-5530), Gansu Provincial Engineering Laboratory for National Geographic State Monitoring, Lanzhou, China
Semantic segmentation of remote sensing images is increasingly important in urban planning, autonomous driving, disaster monitoring, and land cover classification. With the development of high-resolution remote sensing satellite technology, multilevel, large-scale, and high-precision segmentation has become the focus of current research. High-resolution remote sensing images have high intraclass diversity and low interclass separability, which pose challenges to the precise, detailed representation of multiscale information. In this paper, a semantic segmentation method for remote sensing images based on a Swin Transformer fused with a Gabor filter is proposed. First, a Swin Transformer is used as the backbone network to extract image information at different levels. Then, the texture and edge features of the input image are extracted with a Gabor filter, and the multilevel features are merged by introducing a feature aggregation module (FAM) and an attentional embedding module (AEM). Finally, the segmentation result is optimized with a fully connected conditional random field (FC-CRF). Our proposed method, called Swin-S-GF, achieved mean Intersection over Union (mIoU) scores of 80.14%, 66.50%, and 70.61% on the large-scale classification set, the fine land-cover classification set, and the "AI + Remote Sensing imaging dataset" (AI+RS dataset), respectively. Compared with DeepLabV3, mIoU increased by 0.67%, 3.43%, and 3.80%, respectively. We therefore believe that this model provides a good tool for the semantic segmentation of high-precision remote sensing images.
https://ieeexplore.ieee.org/document/9837069/
Keywords: FAM; Gabor filter; remote sensing; semantic segmentation; Swin transformer
title A Semantic Segmentation Method for Remote Sensing Images Based on the Swin Transformer Fusion Gabor Filter
topic FAM
Gabor filter
remote sensing
semantic segmentation
Swin transformer
url https://ieeexplore.ieee.org/document/9837069/