Classification Strategies for Unbalanced Binary Maps: Finding Ponderosa Pine (<i>Pinus ponderosa</i>) in the Willamette Valley

Bibliographic Details
Main Authors: Audrey P. Riddell, Stephen A. Fitzgerald, Chu Qi, Bogdan M. Strimbu
Format: Article
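Language: English
Published: MDPI AG 2020-10-01
Series: Remote Sensing
Subjects: tree species classification; binary maps; maximum likelihood; random forest; canopy height model
Online Access: https://www.mdpi.com/2072-4292/12/20/3325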
collection DOAJ
description Forest species classifications are becoming increasingly automated as advances are made in machine learning. Complex algorithms can reach high accuracies but are not always suitable for small-scale classifications, which may benefit from simpler conventional methods. The goal of this classification was to identify contiguous stands of ponderosa pine (<i>Pinus ponderosa</i> Douglas ex Lawson) against a mix of forest and non-forest background in the southern Willamette Valley, Oregon. The study area is approximately 816,600 ha, considerably larger than most study areas used for presenting techniques for tree species classification. To achieve the objective, we used two classification procedures, one parametric and one non-parametric. For the parametric method, we selected the maximum likelihood (ML) algorithm, whereas for the non-parametric method we chose the random forest (RF) algorithm. To identify ponderosa pine, we used 1 m spatial resolution red-green-blue-infrared (RGBI) aerial images supplied by the U.S. National Agriculture Imagery Program (NAIP) and 1 m spatial resolution canopy height models (CHMs) provided by the Oregon Department of Geology and Mineral Industries (DOGAMI). We tested four data variations for each method: aerial imagery, CHM-masked aerial imagery, aerial imagery with an additional CHM band, and CHM-masked aerial imagery with a CHM band. The parametric classifications of aerial imagery alone reached an average kappa coefficient of 0.29, which increased to 0.51 when masked with CHM data. The incorporation of CHM data as a fifth band resulted in a similar improvement in kappa (0.47), but the most effective parametric method was the incorporation of CHM data as both a fifth band and a post-classification mask, resulting in a kappa coefficient of 0.89. The non-parametric classification of aerial imagery achieved a mean validation kappa coefficient of 0.85 collectively and 0.90 individually, which increased by only approximately 0.01 or less when the CHM masks were applied. The addition of the CHM band increased the kappa value to 0.91 for both individual and collective tile classifications. The highest kappa of all methods was achieved through five-band non-parametric classification with the addition of the CHM band (0.94) for both collective and individual classifications. Our results suggest that parametric methods, when enhanced with a CHM mask, could be suitable for large-area, small-scale classifications based on RGBI imagery, but a non-parametric classification of fused spectral and height data will generally achieve the highest accuracy for large, unbalanced datasets.
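The description above names three operations that are easy to confuse: adding the CHM as a fifth band, applying the CHM as a post-classification mask, and scoring a binary map with the kappa coefficient. The following is a minimal Python sketch of those three steps on synthetic data, not the authors' workflow. The function names, the 2 m height threshold, and the toy arrays are all assumptions made for illustration; none of them come from the record.

```python
import numpy as np


def stack_chm_band(rgbi, chm):
    """Append the canopy height model as a fifth band: (4, H, W) + (H, W) -> (5, H, W)."""
    return np.concatenate([rgbi, chm[np.newaxis, ...]], axis=0)


def apply_chm_mask(pred, chm, min_height_m=2.0):
    """Post-classification mask: reset pixels below the height threshold to background (0).

    The 2 m default is a hypothetical threshold, not a value from the source.
    """
    return np.where(chm >= min_height_m, pred, 0)


def cohen_kappa(truth, pred):
    """Cohen's kappa for two binary label arrays of the same shape."""
    truth = np.asarray(truth).ravel().astype(bool)
    pred = np.asarray(pred).ravel().astype(bool)
    observed = np.mean(truth == pred)  # overall agreement
    # Agreement expected by chance, from the marginal class proportions.
    expected = truth.mean() * pred.mean() + (1 - truth.mean()) * (1 - pred.mean())
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


# Synthetic, unbalanced 10 x 10 tile: 10 pine pixels out of 100.
truth = np.zeros(100, dtype=int)
truth[:10] = 1
truth = truth.reshape(10, 10)

# Toy prediction: 8 true positives, 2 misses, 2 false positives.
pred = np.zeros(100, dtype=int)
pred[:8] = 1
pred[10:12] = 1
pred = pred.reshape(10, 10)

# Toy CHM: the two false positives sit on short vegetation (0.5 m).
chm = np.full(100, 20.0)
chm[10:12] = 0.5
chm = chm.reshape(10, 10)

rgbi = np.zeros((4, 10, 10))
five_band = stack_chm_band(rgbi, chm)  # input for a five-band classifier

accuracy = np.mean(truth == pred)
kappa_raw = cohen_kappa(truth, pred)
kappa_masked = cohen_kappa(truth, apply_chm_mask(pred, chm))

print(f"accuracy={accuracy:.2f} kappa={kappa_raw:.3f} kappa_masked={kappa_masked:.3f}")
```

In this toy tile the overall accuracy is high because background pixels dominate, while kappa is noticeably lower; that gap is why kappa is the more informative score for an unbalanced binary map. Removing the two short-vegetation false positives with the mask raises kappa, which mirrors the direction of the effect reported for the CHM mask, though the magnitudes here are artificial.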
first_indexed 2024-03-10T15:41:48Z
format Article
id doaj.art-cedc2595323c4efc8e43f49e5165019f
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T15:41:48Z
publishDate 2020-10-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-cedc2595323c4efc8e43f49e5165019f
2023-11-20T16:49:13Z
eng
MDPI AG
Remote Sensing, ISSN 2072-4292
2020-10-01, volume 12, issue 20, article 3325
DOI 10.3390/rs12203325
Audrey P. Riddell: Division of Environmental Planning, California Department of Transportation, 100 S Main, Los Angeles, CA 90012, USA
Stephen A. Fitzgerald: Department of Forest Engineering, Resources, and Management, Oregon State University, 280 Peavy Hall, Corvallis, OR 97331, USA
Chu Qi: Department of Forest Engineering, Resources, and Management, Oregon State University, 280 Peavy Hall, Corvallis, OR 97331, USA
Bogdan M. Strimbu: Department of Forest Engineering, Resources, and Management, Oregon State University, 280 Peavy Hall, Corvallis, OR 97331, USA
topic tree species classification
binary maps
maximum likelihood
random forest
canopy height model