BUILDING SEGMENTATION FROM AIRBORNE VHR IMAGES USING MASK R-CNN

Up-to-date 3D building models are important for many applications. Airborne very high resolution (VHR) images, often acquired annually, offer an opportunity to create up-to-date 3D models. Building segmentation is often the first and most important step. Convolutional neural networks (CNNs) have drawn much attention for interpreting VHR images because they can learn very effective features for very complex scenes.

Bibliographic Details
Main Authors: K. Zhou, Y. Chen, I. Smal, R. Lindenbergh
Format: Article
Language: English
Published: Copernicus Publications 2019-06-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLII-2-W13/155/2019/isprs-archives-XLII-2-W13-155-2019.pdf
author K. Zhou
Y. Chen
I. Smal
R. Lindenbergh
collection DOAJ
description Up-to-date 3D building models are important for many applications. Airborne very high resolution (VHR) images, often acquired annually, offer an opportunity to create up-to-date 3D models. Building segmentation is often the first and most important step. Convolutional neural networks (CNNs) have drawn much attention for interpreting VHR images because they can learn very effective features for very complex scenes. This paper employs Mask R-CNN to address two problems in building segmentation: detecting buildings at different scales and segmenting buildings with accurate edges. Mask R-CNN starts from a feature pyramid network (FPN) to create semantically rich features at different scales. The FPN is integrated with a region proposal network (RPN) to generate object proposals at various scales, each with the corresponding optimal scale of features. Features with high and low levels of information are further used for better classification of small objects and for mask prediction at edges. The method is tested on the ISPRS benchmark dataset by comparing its results with those of fully convolutional networks (FCN), which merge high- and low-level features through a skip layer to create a single feature for semantic segmentation. The results show that Mask R-CNN outperforms FCN by around 15% in detecting objects, especially small objects. Moreover, Mask R-CNN gives much better results in edge regions than FCN. The results also show that the choice of the range of anchor scales in Mask R-CNN is a critical factor in segmenting objects of different scales. This paper provides insight into how a good anchor scale should be chosen for different datasets.
first_indexed 2024-12-11T07:47:45Z
format Article
id doaj.art-0169825740fc468c8ad7851e9cd5e63d
institution Directory Open Access Journal
issn 1682-1750
2194-9034
language English
last_indexed 2024-12-11T07:47:45Z
publishDate 2019-06-01
publisher Copernicus Publications
record_format Article
series The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
spelling doaj.art-0169825740fc468c8ad7851e9cd5e63d
2022-12-22T01:15:25Z
eng
Copernicus Publications
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
ISSN 1682-1750, 2194-9034
2019-06-01
Volume XLII-2-W13, pages 155-161
DOI 10.5194/isprs-archives-XLII-2-W13-155-2019
BUILDING SEGMENTATION FROM AIRBORNE VHR IMAGES USING MASK R-CNN
K. Zhou (Dept. of Geoscience and Remote Sensing, Delft University of Technology, the Netherlands)
Y. Chen (Dept. of Computational Science and Engineering, Delft University of Technology, the Netherlands)
I. Smal (Dept. of Geoscience and Remote Sensing, Delft University of Technology, the Netherlands)
R. Lindenbergh (Dept. of Geoscience and Remote Sensing, Delft University of Technology, the Netherlands)
https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLII-2-W13/155/2019/isprs-archives-XLII-2-W13-155-2019.pdf
title BUILDING SEGMENTATION FROM AIRBORNE VHR IMAGES USING MASK R-CNN
url https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLII-2-W13/155/2019/isprs-archives-XLII-2-W13-155-2019.pdf
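
The description in this record states that the range of anchor scales is a critical factor when segmenting objects of different sizes. The following Python sketch illustrates why that can be the case. It is not the authors' code. The level-assignment rule is the heuristic commonly used with FPN-based detectors; the anchor scales, the building sizes in pixels, and all function names are illustrative assumptions rather than values from the paper.

```python
import math


def fpn_level(width, height, k0=4, canonical=224, k_min=2, k_max=5):
    """Pyramid level for a box, using the common FPN heuristic
    k = floor(k0 + log2(sqrt(w * h) / canonical)), clamped to [k_min, k_max].
    The defaults are the usual choices, assumed here for illustration."""
    k = math.floor(k0 + math.log2(math.sqrt(width * height) / canonical))
    return max(k_min, min(k_max, k))


def best_anchor_scale(width, height, anchor_scales):
    """Anchor side length (pixels) closest to the box size in log space."""
    size = math.sqrt(width * height)
    return min(anchor_scales, key=lambda s: abs(math.log2(s / size)))


def square_iou(box_side, anchor_side):
    """IoU of two concentric squares: the best case for a square anchor."""
    small, large = sorted((box_side, anchor_side))
    return (small * small) / (large * large)


if __name__ == "__main__":
    # Hypothetical anchor ranges: a widely used default and a smaller one.
    default_scales = (32, 64, 128, 256, 512)
    small_scales = (8, 16, 32, 64, 128)

    # Hypothetical building footprints (width, height) in pixels.
    for w, h in [(20, 20), (100, 80), (600, 500)]:
        side = math.sqrt(w * h)
        for name, scales in [("default", default_scales), ("small", small_scales)]:
            anchor = best_anchor_scale(w, h, scales)
            print(
                f"{w}x{h} px, {name} anchors: level P{fpn_level(w, h)}, "
                f"anchor {anchor}, best-case IoU {square_iou(side, anchor):.2f}"
            )
```

In this sketch a 20 x 20 pixel building overlaps a 32-pixel anchor much less than a 16-pixel anchor, so an anchor range that starts too high can leave small buildings without a well-matched proposal, while a range that stops too low does the same for large ones.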