On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN
Building footprint (BFP) extraction focuses on the precise pixel-wise segmentation of buildings from aerial photographs such as satellite images. BFP extraction is an essential task in remote sensing and represents the foundation for many higher-level analysis tasks, such as disaster management, mon...
Main Authors: | Muntaha Sakeena, Eric Stumpe, Miroslav Despotovic, David Koch, Matthias Zeppelzauer |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-04-01 |
Series: | Remote Sensing |
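Subjects: | building footprint extraction; satellite image segmentation; building detection; comparative study; robustness evaluation |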
Online Access: | https://www.mdpi.com/2072-4292/15/8/2135 |
_version_ | 1797603579424407552 |
---|---|
author | Muntaha Sakeena; Eric Stumpe; Miroslav Despotovic; David Koch; Matthias Zeppelzauer |
author_facet | Muntaha Sakeena; Eric Stumpe; Miroslav Despotovic; David Koch; Matthias Zeppelzauer |
author_sort | Muntaha Sakeena |
collection | DOAJ |
description | Building footprint (BFP) extraction focuses on the precise pixel-wise segmentation of buildings from aerial photographs such as satellite images. BFP extraction is an essential task in remote sensing and represents the foundation for many higher-level analysis tasks, such as disaster management, monitoring of city development, etc. Building footprint extraction is challenging because buildings can have different sizes, shapes, and appearances both in the same region and in different regions of the world. In addition, effects, such as occlusions, shadows, and bad lighting, have to also be considered and compensated. A rich body of work for BFP extraction has been presented in the literature, and promising research results have been reported on benchmarking datasets. Despite the comprehensive work performed, it is still unclear how robust and generalizable state-of-the-art methods are to different regions, cities, settlement structures, and densities. The purpose of this study is to close this gap by investigating questions on the practical applicability of BFP extraction. In particular, we evaluate the robustness and generalizability of state-of-the-art methods as well as their transfer learning capabilities. Therefore, we investigate in detail two of the most popular deep learning architectures for BFP extraction (i.e., SegNet, an encoder–decoder-based architecture and Mask R-CNN, an object detection architecture) and evaluate them with respect to different aspects on a proprietary high-resolution satellite image dataset as well as on publicly available datasets. Results show that both networks generalize well to new data, new cities, and across cities from different continents. They both benefit from increased training data, especially when this data is from the same distribution (data source) or of comparable resolution. Transfer learning from a data source with different recording parameters is not always beneficial. |
first_indexed | 2024-03-11T04:34:05Z |
format | Article |
id | doaj.art-767c2484afbf44398af54aa94a0a1690 |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-11T04:34:05Z |
publishDate | 2023-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-767c2484afbf44398af54aa94a0a1690; 2023-11-17T21:12:32Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2023-04-01; 15; 8; 2135; 10.3390/rs15082135; On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN; Muntaha Sakeena [0]; Eric Stumpe [1]; Miroslav Despotovic [2]; David Koch [3]; Matthias Zeppelzauer [4]; [0] Institute of Creative Media Technologies, St. Pölten University of Applied Sciences, Campus-Platz 1, A-3100 St. Pölten, Austria; [1] Institute of Creative Media Technologies, St. Pölten University of Applied Sciences, Campus-Platz 1, A-3100 St. Pölten, Austria; [2] Institute for Energy, Facility & Real Estate Management, University of Applied Sciences Kufstein Tirol, Andreas Hofer-Strasse 7, A-6330 Kufstein, Austria; [3] Institute for Energy, Facility & Real Estate Management, University of Applied Sciences Kufstein Tirol, Andreas Hofer-Strasse 7, A-6330 Kufstein, Austria; [4] Institute of Creative Media Technologies, St. Pölten University of Applied Sciences, Campus-Platz 1, A-3100 St. Pölten, Austria; Building footprint (BFP) extraction focuses on the precise pixel-wise segmentation of buildings from aerial photographs such as satellite images. BFP extraction is an essential task in remote sensing and represents the foundation for many higher-level analysis tasks, such as disaster management, monitoring of city development, etc. Building footprint extraction is challenging because buildings can have different sizes, shapes, and appearances both in the same region and in different regions of the world. In addition, effects, such as occlusions, shadows, and bad lighting, have to also be considered and compensated. A rich body of work for BFP extraction has been presented in the literature, and promising research results have been reported on benchmarking datasets. Despite the comprehensive work performed, it is still unclear how robust and generalizable state-of-the-art methods are to different regions, cities, settlement structures, and densities. The purpose of this study is to close this gap by investigating questions on the practical applicability of BFP extraction. In particular, we evaluate the robustness and generalizability of state-of-the-art methods as well as their transfer learning capabilities. Therefore, we investigate in detail two of the most popular deep learning architectures for BFP extraction (i.e., SegNet, an encoder–decoder-based architecture and Mask R-CNN, an object detection architecture) and evaluate them with respect to different aspects on a proprietary high-resolution satellite image dataset as well as on publicly available datasets. Results show that both networks generalize well to new data, new cities, and across cities from different continents. They both benefit from increased training data, especially when this data is from the same distribution (data source) or of comparable resolution. Transfer learning from a data source with different recording parameters is not always beneficial.; https://www.mdpi.com/2072-4292/15/8/2135; building footprint extraction; satellite image segmentation; building detection; comparative study; robustness evaluation |
spellingShingle | Muntaha Sakeena Eric Stumpe Miroslav Despotovic David Koch Matthias Zeppelzauer On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN Remote Sensing building footprint extraction satellite image segmentation building detection comparative study robustness evaluation |
title | On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN |
title_full | On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN |
title_fullStr | On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN |
title_full_unstemmed | On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN |
title_short | On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN |
title_sort | on the robustness and generalization ability of building footprint extraction on the example of segnet and mask r cnn |
topic | building footprint extraction; satellite image segmentation; building detection; comparative study; robustness evaluation |
url | https://www.mdpi.com/2072-4292/15/8/2135 |
work_keys_str_mv | AT muntahasakeena ontherobustnessandgeneralizationabilityofbuildingfootprintextractionontheexampleofsegnetandmaskrcnn AT ericstumpe ontherobustnessandgeneralizationabilityofbuildingfootprintextractionontheexampleofsegnetandmaskrcnn AT miroslavdespotovic ontherobustnessandgeneralizationabilityofbuildingfootprintextractionontheexampleofsegnetandmaskrcnn AT davidkoch ontherobustnessandgeneralizationabilityofbuildingfootprintextractionontheexampleofsegnetandmaskrcnn AT matthiaszeppelzauer ontherobustnessandgeneralizationabilityofbuildingfootprintextractionontheexampleofsegnetandmaskrcnn |