Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage

Bibliographic Details
Main Authors: Jared Frank, Umaa Rebbapragada, James Bialas, Thomas Oommen, Timothy C. Havens
Format: Article
Language: English
Published: MDPI AG 2017-08-01
Series: Remote Sensing
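Subjects: machine learning; classification; crowdsourcing; earthquake damage; damage detection; GEOBIA; mislabeled training data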
Online Access: https://www.mdpi.com/2072-4292/9/8/803
author Jared Frank
Umaa Rebbapragada
James Bialas
Thomas Oommen
Timothy C. Havens
collection DOAJ
description Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the use of imprecise digital labeling tools and crowdsourced volunteers who are not adequately trained on or invested in the task. The spatial nature of remote sensing classification leads to the consistent mislabeling of classes that occur in close proximity to rubble, which is a major byproduct of earthquake damage in urban areas. In this study, we look at how mislabeled training data, or label noise, impact the quality of rubble classifiers operating on high-resolution remotely-sensed images. We first study how label noise dependent on geospatial proximity, or geospatial label noise, compares to standard random noise. Our study shows that classifiers that are robust to random noise are more susceptible to geospatial label noise. We then compare the effects of label noise on both pixel- and object-based remote sensing classification paradigms. While object-based classifiers are known to outperform their pixel-based counterparts, this study demonstrates that they are more susceptible to geospatial label noise. We also introduce a new labeling tool to enhance precision and image coverage. This work has important implications for the Sendai framework as autonomous damage classification will ensure rapid disaster assessment and contribute to the minimization of disaster risk.
first_indexed 2024-04-11T18:03:16Z
format Article
id doaj.art-9ea0f7fbbd49424fbeed6d6ac975e26c
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-04-11T18:03:16Z
publishDate 2017-08-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-9ea0f7fbbd49424fbeed6d6ac975e26c 2022-12-22T04:10:23Z
doi 10.3390/rs9080803
volume 9
issue 8
article_number 803
affiliation Jared Frank: Department of Computer Science, Cornell University, 402 Gates Hall, Ithaca, NY 14850, USA
Umaa Rebbapragada: Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA
James Bialas: Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA
Thomas Oommen: Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA
Timothy C. Havens: Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA
title Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage
topic machine learning
classification
crowdsourcing
earthquake damage
damage detection
GEOBIA
mislabeled training data
url https://www.mdpi.com/2072-4292/9/8/803