Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage

Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the u...

Bibliographic Details
Main Authors:	Jared Frank, Umaa Rebbapragada, James Bialas, Thomas Oommen, Timothy C. Havens
Format:	Article
Language:	English
Published:	MDPI AG 2017-08-01
Series:	Remote Sensing
Subjects:	machine learning; classification; crowdsourcing; earthquake damage; damage detection; GEOBIA; mislabeled training data
Online Access:	https://www.mdpi.com/2072-4292/9/8/803

_version_	1798024491400429568
author	Jared Frank, Umaa Rebbapragada, James Bialas, Thomas Oommen, Timothy C. Havens
author_sort	Jared Frank
collection	DOAJ
description	Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the use of imprecise digital labeling tools and crowdsourced volunteers who are not adequately trained on or invested in the task. The spatial nature of remote sensing classification leads to the consistent mislabeling of classes that occur in close proximity to rubble, which is a major byproduct of earthquake damage in urban areas. In this study, we look at how mislabeled training data, or label noise, impact the quality of rubble classifiers operating on high-resolution remotely-sensed images. We first study how label noise dependent on geospatial proximity, or geospatial label noise, compares to standard random noise. Our study shows that classifiers that are robust to random noise are more susceptible to geospatial label noise. We then compare the effects of label noise on both pixel- and object-based remote sensing classification paradigms. While object-based classifiers are known to outperform their pixel-based counterparts, this study demonstrates that they are more susceptible to geospatial label noise. We also introduce a new labeling tool to enhance precision and image coverage. This work has important implications for the Sendai framework as autonomous damage classification will ensure rapid disaster assessment and contribute to the minimization of disaster risk.
first_indexed	2024-04-11T18:03:16Z
format	Article
id	doaj.art-9ea0f7fbbd49424fbeed6d6ac975e26c
institution	Directory of Open Access Journals
issn	2072-4292
language	English
last_indexed	2024-04-11T18:03:16Z
publishDate	2017-08-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-9ea0f7fbbd49424fbeed6d6ac975e26c | 2022-12-22T04:10:23Z | eng | MDPI AG | Remote Sensing | 2072-4292 | 2017-08-01 | 9 | 8 | 803 | 10.3390/rs9080803 | rs9080803 | Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage | Jared Frank (0), Umaa Rebbapragada (1), James Bialas (2), Thomas Oommen (3), Timothy C. Havens (4) | (0) Department of Computer Science, Cornell University, 402 Gates Hall, Ithaca, NY 14850, USA; (1) Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA; (2) Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA; (3) Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA; (4) Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA | https://www.mdpi.com/2072-4292/9/8/803 | machine learning; classification; crowdsourcing; earthquake damage; damage detection; GEOBIA; mislabeled training data
title	Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage
topic	machine learning; classification; crowdsourcing; earthquake damage; damage detection; GEOBIA; mislabeled training data
url	https://www.mdpi.com/2072-4292/9/8/803

Similar Items