A Building Extraction Method for High-Resolution Remote Sensing Images with Multiple Attentions and Parallel Encoders Combining Enhanced Spectral Information

Accurately extracting pixel-level buildings from high-resolution remote sensing images is significant for various geographical information applications. Influenced by different natural, cultural, and social development levels, buildings may vary in shape and distribution, making it difficult for the...

Bibliographic details
Main authors: Zhaojun Pang, Rongming Hu, Wu Zhu, Renyi Zhu, Yuxin Liao, Xiying Han
Format: Article
Language: English
Published: MDPI AG 2024-02-01
Series: Sensors
Subjects: high-resolution remote sensing imagery; building extraction; deep convolutional neural network (DCNN); transformer; spectral enhancement
Online access: https://www.mdpi.com/1424-8220/24/3/1006
author Zhaojun Pang
Rongming Hu
Wu Zhu
Renyi Zhu
Yuxin Liao
Xiying Han
collection DOAJ
description Accurately extracting buildings at the pixel level from high-resolution remote sensing images is important for many geographic information applications. Because natural, cultural, and social development conditions differ, buildings vary in shape and distribution, which makes it difficult for a network to segment buildings consistently across different areas of an image. In addition, the complex spectra of features in remote sensing images can affect the extracted details of multi-scale buildings in different ways. To address these problems, this study selects parts of Xi’an City, Shaanxi Province, China, as the study area and proposes a parallel-encoder building extraction network (MARS-Net) that incorporates multiple attention mechanisms. MARS-Net builds its parallel encoder from a DCNN and a transformer to exploit their respective strengths in extracting local and global features. Depending on the depth position in the network, coordinate attention (CA) and the convolutional block attention module (CBAM) are introduced to bridge the encoder and decoder and retain richer spatial and semantic information during encoding, and dense atrous spatial pyramid pooling (DenseASPP) is added to capture multi-scale contextual information during the upsampling of the decoder layers. In addition, a spectral information enhancement module (SIEM) is designed. SIEM further improves building segmentation by blending and enhancing multi-band building information using the relationships between bands. The experimental results show that MARS-Net achieves better extraction results and improves further after SIEM is added. The IoU values on the self-built Xi’an and WHU building datasets are 87.53% and 89.62%, respectively, and the corresponding F1 scores are 93.34% and 94.52%.
first_indexed 2024-03-08T03:48:45Z
format Article
id doaj.art-5fd95c26032a45fba00e74951b3794ed
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-08T03:48:45Z
publishDate 2024-02-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-5fd95c26032a45fba00e74951b3794ed
2024-02-09T15:22:33Z
eng
MDPI AG
Sensors
1424-8220
2024-02-01
volume 24, issue 3, article 1006
10.3390/s24031006
A Building Extraction Method for High-Resolution Remote Sensing Images with Multiple Attentions and Parallel Encoders Combining Enhanced Spectral Information
Zhaojun Pang (School of Geomatics, Xi’an University of Science and Technology, Xi’an 710054, China)
Rongming Hu (School of Geomatics, Xi’an University of Science and Technology, Xi’an 710054, China)
Wu Zhu (School of Geological Engineering and Geomatics, Chang’an University, Xi’an 710054, China)
Renyi Zhu (The First Institute of Geoinformation Mapping, Ministry of Natural Resources, Xi’an 710054, China)
Yuxin Liao (School of Geomatics, Xi’an University of Science and Technology, Xi’an 710054, China)
Xiying Han (The First Institute of Geoinformation Mapping, Ministry of Natural Resources, Xi’an 710054, China)
https://www.mdpi.com/1424-8220/24/3/1006
high-resolution remote sensing imagery
building extraction
deep convolutional neural network (DCNN)
transformer
spectral enhancement
title A Building Extraction Method for High-Resolution Remote Sensing Images with Multiple Attentions and Parallel Encoders Combining Enhanced Spectral Information
topic high-resolution remote sensing imagery
building extraction
deep convolutional neural network (DCNN)
transformer
spectral enhancement
url https://www.mdpi.com/1424-8220/24/3/1006
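
The description reports results as IoU and F1 scores for the building class. As a reading aid, the sketch below shows how these two metrics are conventionally computed from a predicted and a reference binary mask. It is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name `iou_and_f1`, the nested-list mask format, and the toy masks are all hypothetical.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for the building class.

    pred, truth: equal-sized 2D lists of 0/1 values, where 1 marks a
    building pixel. This input format is an assumption for illustration.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted and present
            elif p and not t:
                fp += 1          # building predicted but absent
            elif t and not p:
                fn += 1          # building present but missed
    union = tp + fp + fn
    if union == 0:
        # Convention chosen here: two empty masks count as a perfect match.
        return 1.0, 1.0
    iou = tp / union                      # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return iou, f1


# Toy masks (hypothetical data): 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
```

When both metrics are computed from the same pixel counts, they are linked by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU.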