SEMANTIC SEGMENTATION OF AERIAL IMAGES WITH AN ENSEMBLE OF CNNS

Bibliographic Details
Main Authors: D. Marmanis, J. D. Wegner, S. Galliani, K. Schindler, M. Datcu, U. Stilla
Format: Article
Language: English
Published: Copernicus Publications 2016-06-01
Series: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: http://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/III-3/473/2016/isprs-annals-III-3-473-2016.pdf
author D. Marmanis
J. D. Wegner
S. Galliani
K. Schindler
M. Datcu
U. Stilla
collection DOAJ
description This paper describes a deep learning approach to semantic segmentation of very high resolution (aerial) images. Deep neural architectures hold the promise of end-to-end learning from raw images, making heuristic feature design obsolete. Over the last decade this idea has seen a revival, and in recent years deep convolutional neural networks (CNNs) have emerged as the method of choice for a range of image interpretation tasks like visual recognition and object detection. Still, standard CNNs do not lend themselves to per-pixel semantic segmentation, mainly because one of their fundamental principles is to gradually aggregate information over larger and larger image regions, making it hard to disentangle contributions from different pixels. Very recently, two extensions of the CNN framework have made it possible to trace the semantic information back to a precise pixel position: deconvolutional network layers undo the spatial downsampling, and Fully Convolutional Networks (FCNs) modify the fully connected classification layers of the network in such a way that the location of individual activations remains explicit. We design an FCN which takes as input intensity and range data and, with the help of aggressive deconvolution and recycling of early network layers, converts them into a pixelwise classification at full resolution. We discuss design choices and intricacies of such a network, and demonstrate that an ensemble of several networks achieves excellent results on challenging data such as the ISPRS semantic labeling benchmark, using only the raw data as input.
first_indexed 2024-12-10T11:10:50Z
format Article
id doaj.art-9c12940b47d4460d81dfb21babafb212
institution Directory Open Access Journal
issn 2194-9042
2194-9050
spelling ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, III-3, pp. 473-480, 2016-06-01
doi 10.5194/isprs-annals-III-3-473-2016
author_affiliation D. Marmanis: DLR-DFD Department, German Aerospace Center, Oberpfaffenhofen, Germany
J. D. Wegner: DLR-DFD Department, German Aerospace Center, Oberpfaffenhofen, Germany
S. Galliani: Photogrammetry and Remote Sensing, ETH Zurich, Switzerland
K. Schindler: Photogrammetry and Remote Sensing, ETH Zurich, Switzerland
M. Datcu: DLR-IMF Department, German Aerospace Center, Oberpfaffenhofen, Germany
U. Stilla: Photogrammetry and Remote Sensing, TU München, Germany
title SEMANTIC SEGMENTATION OF AERIAL IMAGES WITH AN ENSEMBLE OF CNNS
url http://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/III-3/473/2016/isprs-annals-III-3-473-2016.pdf