Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network


Bibliographic Details
Main Authors: Dengji Zhou, Guizhou Wang, Guojin He, Tengfei Long, Ranyu Yin, Zhaoming Zhang, Sibao Chen, Bin Luo
Format: Article
Language: English
Published: MDPI AG 2020-12-01
Series: Sensors
Subjects: building extraction, high resolution image, semantic segmentation, deep learning
Online Access: https://www.mdpi.com/1424-8220/20/24/7241
_version_ 1797544346945323008
author Dengji Zhou
Guizhou Wang
Guojin He
Tengfei Long
Ranyu Yin
Zhaoming Zhang
Sibao Chen
Bin Luo
author_facet Dengji Zhou
Guizhou Wang
Guojin He
Tengfei Long
Ranyu Yin
Zhaoming Zhang
Sibao Chen
Bin Luo
author_sort Dengji Zhou
collection DOAJ
description Building extraction from high spatial resolution remote sensing images is a hot topic in remote sensing applications and computer vision. This paper presents a supervised semantic segmentation model named Pyramid Self-Attention Network (PISANet). Its structure is simple, because it contains only two parts: one is the backbone of the network, which learns the local features of buildings from the image (short-distance context information around each pixel); the other is the pyramid self-attention module, which obtains the global features (long-distance context information with respect to other pixels in the image) and the comprehensive features (color, texture, geometric, and high-level semantic features) of buildings. The network is end-to-end. In the training stage, the inputs are the remote sensing image and the corresponding label, and the output is a probability map (the probability that each pixel is or is not a building). In the prediction stage, the input is the remote sensing image, and the output is the building extraction result. The complexity of the network structure is reduced so that it is easy to implement. The proposed PISANet was tested on two datasets. The results show that the overall accuracy reached 94.50% and 96.15%, the intersection-over-union reached 77.45% and 87.97%, and the F1 score reached 87.27% and 93.55%, respectively. In experiments on different datasets, PISANet achieved high overall accuracy, a low error rate, and improved integrity of individual buildings.
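The abstract describes the architecture only at a high level: a convolutional backbone for local features, followed by a pyramid self-attention module that adds long-distance context, ending in a per-pixel building probability map. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' implementation; the layer sizes, the `grid_sizes` of the pyramid, and the class names `SelfAttention2d`, `PyramidSelfAttention`, and `PISANetSketch` are all assumptions.

```python
# Hypothetical sketch of a PISANet-style model (not the authors' code):
# a small convolutional backbone extracts local features, a pyramid of
# self-attention blocks adds global context at several pooled grid sizes,
# and a 1x1 head produces a per-pixel building probability map.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttention2d(nn.Module):
    """Plain spatial self-attention (non-local) block with a residual connection."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)        # (b, hw, c')
        k = self.key(x).flatten(2)                           # (b, c', hw)
        v = self.value(x).flatten(2).transpose(1, 2)         # (b, hw, c)
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out


class PyramidSelfAttention(nn.Module):
    """Run self-attention on copies of the feature map pooled to coarser grids."""

    def __init__(self, channels, grid_sizes=(8, 16, 32)):
        super().__init__()
        self.grid_sizes = grid_sizes
        self.blocks = nn.ModuleList([SelfAttention2d(channels) for _ in grid_sizes])
        self.fuse = nn.Conv2d(channels * (len(grid_sizes) + 1), channels, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [x]                                          # keep local features
        for size, block in zip(self.grid_sizes, self.blocks):
            pooled = F.adaptive_avg_pool2d(x, size)          # coarse grid
            attended = block(pooled)                         # global context
            feats.append(F.interpolate(attended, size=(h, w),
                                       mode="bilinear", align_corners=False))
        return self.fuse(torch.cat(feats, dim=1))


class PISANetSketch(nn.Module):
    def __init__(self, in_channels=3, width=64):
        super().__init__()
        self.backbone = nn.Sequential(                       # local features
            nn.Conv2d(in_channels, width, 3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
        )
        self.pyramid = PyramidSelfAttention(width)           # global context
        self.head = nn.Conv2d(width, 1, 1)                   # building vs. background

    def forward(self, x):
        h, w = x.shape[2:]
        logits = self.head(self.pyramid(self.backbone(x)))
        logits = F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=False)
        return torch.sigmoid(logits)                         # per-pixel probability


if __name__ == "__main__":
    probs = PISANetSketch()(torch.randn(1, 3, 256, 256))
    print(probs.shape)  # torch.Size([1, 1, 256, 256])
```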
first_indexed 2024-03-10T13:59:10Z
format Article
id doaj.art-79f805ed900f430b84d37d2e6d357f69
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-10T13:59:10Z
publishDate 2020-12-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-79f805ed900f430b84d37d2e6d357f69 | 2023-11-21T01:17:48Z | eng | MDPI AG | Sensors | 1424-8220 | 2020-12-01 | Vol. 20, Iss. 24, Art. 7241 | doi:10.3390/s20247241 | Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network | Dengji Zhou, Guizhou Wang, Guojin He, Tengfei Long, Ranyu Yin, Zhaoming Zhang (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China); Sibao Chen, Bin Luo (MOE Key Lab of Signal Processing and Intelligent Computing, School of Computer Science and Technology, Anhui University, Hefei 230601, China) | Building extraction from high spatial resolution remote sensing images is a hot topic in remote sensing applications and computer vision. This paper presents a supervised semantic segmentation model named Pyramid Self-Attention Network (PISANet). Its structure is simple, because it contains only two parts: one is the backbone of the network, which learns the local features of buildings from the image (short-distance context information around each pixel); the other is the pyramid self-attention module, which obtains the global features (long-distance context information with respect to other pixels in the image) and the comprehensive features (color, texture, geometric, and high-level semantic features) of buildings. The network is end-to-end. In the training stage, the inputs are the remote sensing image and the corresponding label, and the output is a probability map (the probability that each pixel is or is not a building). In the prediction stage, the input is the remote sensing image, and the output is the building extraction result. The complexity of the network structure is reduced so that it is easy to implement. The proposed PISANet was tested on two datasets. The results show that the overall accuracy reached 94.50% and 96.15%, the intersection-over-union reached 77.45% and 87.97%, and the F1 score reached 87.27% and 93.55%, respectively. In experiments on different datasets, PISANet achieved high overall accuracy, a low error rate, and improved integrity of individual buildings. | https://www.mdpi.com/1424-8220/20/24/7241 | building extraction; high resolution image; semantic segmentation; deep learning
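The record reports pixel-wise overall accuracy, intersection-over-union, and F1 for the two test datasets. As a reference point, the snippet below computes those three metrics from a predicted probability map and a binary label mask using their standard confusion-matrix definitions; the 0.5 threshold and the helper name `building_metrics` are illustrative assumptions, and the authors' actual evaluation protocol is not part of this record.

```python
# Standard pixel-wise metrics for binary building extraction
# (overall accuracy, intersection-over-union, F1); illustrative only.
import numpy as np


def building_metrics(prob_map, label, threshold=0.5):
    """prob_map: predicted building probabilities in [0, 1]; label: 0/1 ground truth."""
    pred = prob_map >= threshold
    truth = label > 0

    tp = np.sum(pred & truth)        # building pixels correctly detected
    tn = np.sum(~pred & ~truth)      # background pixels correctly rejected
    fp = np.sum(pred & ~truth)       # background predicted as building
    fn = np.sum(~pred & truth)       # building pixels missed

    overall_accuracy = (tp + tn) / (tp + tn + fp + fn)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return overall_accuracy, iou, f1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = rng.random((256, 256))
    labels = (rng.random((256, 256)) > 0.7).astype(np.uint8)
    print(building_metrics(probs, labels))
```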
spellingShingle Dengji Zhou
Guizhou Wang
Guojin He
Tengfei Long
Ranyu Yin
Zhaoming Zhang
Sibao Chen
Bin Luo
Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network
Sensors
building extraction
high resolution image
semantic segmentation
deep learning
title Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network
title_full Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network
title_fullStr Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network
title_full_unstemmed Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network
title_short Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network
title_sort robust building extraction for high spatial resolution remote sensing images with self attention network
topic building extraction
high resolution image
semantic segmentation
deep learning
url https://www.mdpi.com/1424-8220/20/24/7241
work_keys_str_mv AT dengjizhou robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT guizhouwang robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT guojinhe robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT tengfeilong robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT ranyuyin robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT zhaomingzhang robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT sibaochen robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork
AT binluo robustbuildingextractionforhighspatialresolutionremotesensingimageswithselfattentionnetwork