Uncertainty is not sufficient for identifying noisy labels in training data for binary segmentation of building footprints

Obtaining high-quality labels is a major challenge for the application of deep neural networks in the remote sensing domain. A common way of acquiring labels is crowdsourcing, which can provide much-needed training data sets but often also yields incorrect labels, which can affect the...

Bibliographic Details
Main Authors: Hannah Ulman, Jonas Gütter, Julia Niebling
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-01-01
Series: Frontiers in Remote Sensing
Subjects: deep learning; remote sensing; uncertainty; label noise; segmentation
Online Access: https://www.frontiersin.org/articles/10.3389/frsen.2022.1100012/full
author Hannah Ulman
Jonas Gütter
Julia Niebling
collection DOAJ
description Obtaining high-quality labels is a major challenge for the application of deep neural networks in the remote sensing domain. A common way of acquiring labels is crowdsourcing, which can provide much-needed training data sets but often also yields incorrect labels, which can significantly affect the training process of a deep neural network. In this paper, we exploit uncertainty to identify a specific type of label noise for semantic segmentation of buildings in satellite imagery. This type of label noise is known as “omission noise,” i.e., missing labels for whole buildings that still appear in the satellite image. According to the literature, uncertainty during training can help in identifying the “sweet spot” between generalizing well and overfitting to label noise, which is then used to differentiate between noisy and clean labels. This differentiation is based on pixel-wise uncertainty estimation and on beta distribution fitting applied to the uncertainty estimates. For our study, we create a data set for building segmentation with different levels of omission noise to evaluate the impact of the noise level on the performance of the deep neural network during training. In doing so, we show that established uncertainty-based methods for identifying noisy labels are, in general, not sufficient for our kind of remote sensing data. On the other hand, for some noise levels, we observe promising differences between noisy and clean data, which opens up the possibility of refining the state-of-the-art methods further.
first_indexed 2024-04-10T23:51:01Z
format Article
id doaj.art-20cc4bb67afc46e39fff99ddaffab5fe
institution Directory Open Access Journal
issn 2673-6187
language English
last_indexed 2024-04-10T23:51:01Z
publishDate 2023-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Remote Sensing
spelling doaj.art-20cc4bb67afc46e39fff99ddaffab5fe; 2023-01-10T17:59:36Z; eng; Frontiers Media S.A.; Frontiers in Remote Sensing; 2673-6187; 2023-01-01; 3; 10.3389/frsen.2022.1100012; 1100012; Uncertainty is not sufficient for identifying noisy labels in training data for binary segmentation of building footprints; Hannah Ulman (Princeton University, Princeton, NJ, United States); Jonas Gütter (Institute of Data Science, Data Analysis, and Intelligence, German Aerospace Center (DLR), Jena, Germany); Julia Niebling (Institute of Data Science, Data Analysis, and Intelligence, German Aerospace Center (DLR), Jena, Germany)
title Uncertainty is not sufficient for identifying noisy labels in training data for binary segmentation of building footprints
topic deep learning
remote sensing
uncertainty
label noise
segmentation
url https://www.frontiersin.org/articles/10.3389/frsen.2022.1100012/full
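
Illustration (not part of the catalog record): the description mentions pixel-wise uncertainty estimation followed by beta distribution fitting. The record does not say which uncertainty measure or which fitting procedure the article uses, so the following is only a minimal sketch under stated assumptions: uncertainty is taken to be the binary entropy of a predicted building probability, and the beta parameters are estimated by the method of moments. The function names and the sample probabilities are hypothetical.

```python
import math
from statistics import fmean, pvariance


def binary_entropy(p, eps=1e-12):
    """Entropy in bits of a Bernoulli probability p; the result lies in [0, 1].

    Assumption: entropy stands in for the per-pixel uncertainty measure.
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def fit_beta_moments(values):
    """Method-of-moments estimate of (alpha, beta) for samples in (0, 1).

    Assumption: a simple moment-matching fit, not necessarily the
    estimator used in the article.
    """
    m = fmean(values)
    v = pvariance(values, mu=m)
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("sample moments are incompatible with a beta distribution")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical per-pixel building probabilities for two groups of pixels.
confident_probs = [0.02, 0.05, 0.97, 0.93, 0.08, 0.99, 0.04, 0.90]
ambiguous_probs = [0.35, 0.55, 0.62, 0.41, 0.48, 0.70, 0.30, 0.52]

# Convert probabilities to uncertainties, then fit one beta distribution per group.
confident_fit = fit_beta_moments([binary_entropy(p) for p in confident_probs])
ambiguous_fit = fit_beta_moments([binary_entropy(p) for p in ambiguous_probs])

print("confident pixels: alpha=%.2f beta=%.2f" % confident_fit)
print("ambiguous pixels: alpha=%.2f beta=%.2f" % ambiguous_fit)
```

In this toy setting the two fitted distributions have clearly different means. The description reports that, on real remote sensing data, such differences between clean and noisy labels are in general not large enough to separate the two reliably.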