A Combined Strategy of Improved Variable Selection and Ensemble Algorithm to Map the Growing Stem Volume of Planted Coniferous Forest

Remote sensing technology is becoming the mainstream approach for mapping growing stem volume (GSV), overcoming the shortcomings of traditional labor-intensive field methods. The accuracy of GSV estimation from remote sensing imagery depends strongly on the variable selection method and the algorithm used...


Bibliographic Details
Main Authors: Xiaodong Xu, Hui Lin, Zhaohua Liu, Zilin Ye, Xinyu Li, Jiangping Long
Format: Article
Language: English
Published: MDPI AG 2021-11-01
Series: Remote Sensing
Subjects: growing stem volume; sentinel; variable selection; ensemble algorithm
Online Access: https://www.mdpi.com/2072-4292/13/22/4631
author Xiaodong Xu
Hui Lin
Zhaohua Liu
Zilin Ye
Xinyu Li
Jiangping Long
collection DOAJ
description Remote sensing technology is becoming the mainstream approach for mapping growing stem volume (GSV), overcoming the shortcomings of traditional labor-intensive field methods. The accuracy of GSV estimation from remote sensing imagery depends strongly on the variable selection method and the algorithm used. To reduce the uncertainty caused by variables and models, this paper proposes a combined strategy: an improved variable selection procedure with a collinearity test, which yields the optimal variable combination, and a secondary ensemble algorithm, which derives a reliable GSV estimate from several base models. The study extracted four types of candidate variables from Sentinel-1A and Sentinel-2A imagery: vegetation indices, spectral reflectance variables, backscattering coefficients, and texture features. An improved variable selection criterion with the collinearity test was then developed and evaluated with four machine learning algorithms (classification and regression trees (CART), k-nearest neighbors (KNN), support vector regression (SVR), and artificial neural network (ANN)). The criterion considers both the correlation between each variable and GSV (with random forest (RF), distance correlation coefficient (DC), maximal information coefficient (MIC), and Pearson correlation coefficient (PCC) as evaluation metrics) and the collinearity among the variables. Additionally, a secondary ensemble with an improved weighted average approach (IWA) was proposed to estimate forest GSV reliably from the first-stage ensemble models constructed by Bagging and AdaBoost. The experimental results demonstrated that the proposed variable selection criterion efficiently obtained the optimal variable set without reducing the forest GSV mapping accuracy. For the first ensemble, the relative root mean square error (rRMSE) ranged from 21.91% to 30.28% for Bagging and from 23.33% to 31.49% for AdaBoost. After the secondary ensemble involving the IWA, the rRMSE ranged from 18.89% to 21.34%. Furthermore, the variance of the GSV mapped by the secondary ensemble across the various ranking methods was significantly reduced. The results show that the proposed combined strategy has great potential to reduce the GSV mapping uncertainty imposed by current variable selection approaches and algorithms.
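The abstract names three ingredients: ranking variables by their relationship with GSV while testing collinearity, scoring models by rRMSE, and combining models with a weighted average. The sketch below illustrates each in a generic form and is not the authors' implementation. It assumes Pearson correlation for both the ranking and the collinearity test, a hypothetical collinearity threshold of 0.8, the usual rRMSE definition (RMSE divided by the observed mean), and plain inverse-error weights as a stand-in for the IWA, whose exact form the abstract does not give. All function names are illustrative.

```python
import numpy as np


def select_variables(X, y, max_collinearity=0.8):
    """Rank columns of X by |Pearson r| with y, then greedily keep a column
    only if its |r| with every already-kept column is below the threshold.
    The 0.8 default is an assumed value, not taken from the paper."""
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    order = np.argsort(-relevance)                 # most relevant first
    inter = np.abs(np.corrcoef(X, rowvar=False))   # feature-feature |r|
    kept = []
    for j in order:
        if all(inter[j, k] < max_collinearity for k in kept):
            kept.append(int(j))
    return kept


def rrmse(y_true, y_pred):
    """Relative RMSE in percent: RMSE divided by the mean observation."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.mean(y_true)


def weighted_average(predictions, errors):
    """Combine per-model predictions (one row per model) with weights
    proportional to 1 / error. A generic scheme assumed for illustration."""
    weights = 1.0 / np.asarray(errors, dtype=float)
    weights /= weights.sum()
    return weights @ np.asarray(predictions, dtype=float)


# Hypothetical usage: two models predicting GSV for three plots.
observed = np.array([120.0, 150.0, 180.0])
model_preds = np.array([[110.0, 160.0, 170.0],
                        [130.0, 140.0, 200.0]])
model_errors = [rrmse(observed, p) for p in model_preds]
combined = weighted_average(model_preds, model_errors)
```

In this scheme a model with a lower rRMSE receives a larger weight, and a candidate variable that is nearly a copy of an already selected one is dropped regardless of how well it correlates with GSV.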
first_indexed 2024-03-10T05:05:49Z
format Article
id doaj.art-362c77df3cf240a9b98af9821f064afd
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T05:05:49Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Remote Sensing
doi 10.3390/rs13224631
affiliation Research Center of Forestry Remote Sensing & Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China
title A Combined Strategy of Improved Variable Selection and Ensemble Algorithm to Map the Growing Stem Volume of Planted Coniferous Forest
topic growing stem volume
sentinel
variable selection
ensemble algorithm
url https://www.mdpi.com/2072-4292/13/22/4631