Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation

Semantic segmentation of buildings in high-resolution remote sensing images is challenging, given the high variability in appearance and the complicated backgrounds of the buildings. In this communication, we proposed an ensemble multi-scale residual deep learning method with t...


Bibliographic Details
Main Authors: Chengyi Wang, Lianfa Li
Format: Article
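Language: English
Published: MDPI AG 2020-09-01
Series: Remote Sensing
Subjects: multiple scales; residual deep ensemble learning; regularizer; shape representation; semantic segmentation of buildings
Online Access: https://www.mdpi.com/2072-4292/12/18/2932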
author Chengyi Wang
Lianfa Li
collection DOAJ
description Semantic segmentation of buildings in high-resolution remote sensing images is challenging, given the high variability in appearance and the complicated backgrounds of the buildings. In this communication, we proposed an ensemble multi-scale residual deep learning method with a regularizer of shape representation for semantic segmentation of buildings. Based on the U-Net architecture with residual connections and multi-scale ASPP (atrous spatial pyramid pooling) modules, our method introduced the regularizer of shape representation and ensemble learning of multi-scale models to enhance model training and reduce over-fitting. In our method, the shape representation was coded in an autoencoder that was used to encode and reconstruct the shape characteristics of the buildings. In prediction, we considered multi-scale trained models for different resolution inputs and side effects to obtain an optimal semantic segmentation. Using a high-resolution image of Changshan, an island county in China, we trained the model on two-thirds of the study region image and kept the remaining one-third for the independent test. In validation, we obtained an accuracy of 0.98–0.99, a mean intersection over union (MIoU) of 0.91–0.93 and a Jaccard coefficient of 0.89–0.92. In the independent test, our method achieved state-of-the-art performance (MIoU: 0.83; Jaccard index: 0.81). In comparisons with existing representative methods on four different data sets, the proposed method consistently improved the learning process and generalization. The study shows the important contributions of ensemble learning of multi-scale residual models and of the regularizer of shape representation to semantic segmentation of buildings.
first_indexed 2024-03-10T16:26:45Z
format Article
id doaj.art-b23cdb2f5e794f71a5c3772bd10d2aa1
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T16:26:45Z
publishDate 2020-09-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-b23cdb2f5e794f71a5c3772bd10d2aa1
2023-11-20T13:12:25Z
eng
MDPI AG
Remote Sensing
2072-4292
2020-09-01
Volume 12, Issue 18, Article 2932
10.3390/rs12182932
Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
Chengyi Wang (0)
Lianfa Li (1)
(0) National Engineering Research Center for Geomatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Datun Road, Beijing 100101, China
(1) State Key Laboratory of Resources and Environmental Information Systems, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Datun Road, Beijing 100101, China
https://www.mdpi.com/2072-4292/12/18/2932
multiple scales
residual deep ensemble learning
regularizer
shape representation
semantic segmentation of buildings
title Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
topic multiple scales
residual deep ensemble learning
regularizer
shape representation
semantic segmentation of buildings
url https://www.mdpi.com/2072-4292/12/18/2932
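
The description above reports accuracy, MIoU and a Jaccard coefficient for the building masks. As a rough illustration of how such scores are commonly computed for a binary building/background mask, here is a minimal sketch. It assumes that the Jaccard index is the intersection over union of the building class and that MIoU averages the building and background IoU; the record does not state the exact definitions used in the article, and the function names are hypothetical.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for flattened binary labels (1 = building, 0 = background)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred: Sequence[int], truth: Sequence[int]) -> dict[str, float]:
    """Pixel accuracy, building-class IoU (Jaccard) and two-class mean IoU.

    Assumption: MIoU is the unweighted mean of building IoU and background IoU.
    """
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("empty input")
    accuracy = (tp + tn) / total
    # A class absent from both prediction and truth is scored as a perfect match.
    iou_building = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    iou_background = tn / (tn + fp + fn) if (tn + fp + fn) else 1.0
    return {
        "accuracy": accuracy,
        "jaccard": iou_building,
        "miou": (iou_building + iou_background) / 2,
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 0, 0, 0, 1, 1, 0, 0]
    print(segmentation_metrics(predicted, reference))
```

Because background pixels usually dominate a scene, accuracy tends to sit well above the building-class IoU, which is consistent with the gap between the accuracy and Jaccard values reported in the description.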