Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage
Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the u...
Main Authors: | Jared Frank, Umaa Rebbapragada, James Bialas, Thomas Oommen, Timothy C. Havens |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2017-08-01 |
Series: | Remote Sensing |
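Subjects: | machine learning, classification, crowdsourcing, earthquake damage, damage detection, GEOBIA, mislabeled training data |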
Online Access: | https://www.mdpi.com/2072-4292/9/8/803 |
_version_ | 1798024491400429568 |
---|---|
author | Jared Frank Umaa Rebbapragada James Bialas Thomas Oommen Timothy C. Havens |
author_sort | Jared Frank |
collection | DOAJ |
description | Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the use of imprecise digital labeling tools and crowdsourced volunteers who are not adequately trained on or invested in the task. The spatial nature of remote sensing classification leads to the consistent mislabeling of classes that occur in close proximity to rubble, which is a major byproduct of earthquake damage in urban areas. In this study, we look at how mislabeled training data, or label noise, impact the quality of rubble classifiers operating on high-resolution remotely-sensed images. We first study how label noise dependent on geospatial proximity, or geospatial label noise, compares to standard random noise. Our study shows that classifiers that are robust to random noise are more susceptible to geospatial label noise. We then compare the effects of label noise on both pixel- and object-based remote sensing classification paradigms. While object-based classifiers are known to outperform their pixel-based counterparts, this study demonstrates that they are more susceptible to geospatial label noise. We also introduce a new labeling tool to enhance precision and image coverage. This work has important implications for the Sendai framework as autonomous damage classification will ensure rapid disaster assessment and contribute to the minimization of disaster risk. |
first_indexed | 2024-04-11T18:03:16Z |
format | Article |
id | doaj.art-9ea0f7fbbd49424fbeed6d6ac975e26c |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-04-11T18:03:16Z |
publishDate | 2017-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-9ea0f7fbbd49424fbeed6d6ac975e26c; 2022-12-22T04:10:23Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2017-08-01; 9; 8; 803; 10.3390/rs9080803; rs9080803; Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage; Jared Frank (Department of Computer Science, Cornell University, 402 Gates Hall, Ithaca, NY 14850, USA); Umaa Rebbapragada (Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA); James Bialas (Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA); Thomas Oommen (Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA); Timothy C. Havens (Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA); Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the use of imprecise digital labeling tools and crowdsourced volunteers who are not adequately trained on or invested in the task. The spatial nature of remote sensing classification leads to the consistent mislabeling of classes that occur in close proximity to rubble, which is a major byproduct of earthquake damage in urban areas. In this study, we look at how mislabeled training data, or label noise, impact the quality of rubble classifiers operating on high-resolution remotely-sensed images. We first study how label noise dependent on geospatial proximity, or geospatial label noise, compares to standard random noise. Our study shows that classifiers that are robust to random noise are more susceptible to geospatial label noise. We then compare the effects of label noise on both pixel- and object-based remote sensing classification paradigms. While object-based classifiers are known to outperform their pixel-based counterparts, this study demonstrates that they are more susceptible to geospatial label noise. We also introduce a new labeling tool to enhance precision and image coverage. This work has important implications for the Sendai framework as autonomous damage classification will ensure rapid disaster assessment and contribute to the minimization of disaster risk.; https://www.mdpi.com/2072-4292/9/8/803; machine learning; classification; crowdsourcing; earthquake damage; damage detection; GEOBIA; mislabeled training data |
spellingShingle | Jared Frank Umaa Rebbapragada James Bialas Thomas Oommen Timothy C. Havens Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage Remote Sensing machine learning classification crowdsourcing earthquake damage damage detection GEOBIA mislabeled training data |
title | Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage |
title_sort | effect of label noise on the machine learned classification of earthquake damage |
topic | machine learning classification crowdsourcing earthquake damage damage detection GEOBIA mislabeled training data |
url | https://www.mdpi.com/2072-4292/9/8/803 |