A Combined Strategy of Improved Variable Selection and Ensemble Algorithm to Map the Growing Stem Volume of Planted Coniferous Forest
Remote sensing is becoming the mainstream approach for mapping growing stem volume (GSV) and overcoming the shortcomings of traditional labor-intensive approaches. Naturally, the accuracy of GSV estimation from remote sensing imagery depends strongly on the variable selection methods and algorithms....
Main Authors: | Xiaodong Xu, Hui Lin, Zhaohua Liu, Zilin Ye, Xinyu Li, Jiangping Long |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-11-01 |
Series: | Remote Sensing |
Subjects: | growing stem volume; sentinel; variable selection; ensemble algorithm |
Online Access: | https://www.mdpi.com/2072-4292/13/22/4631 |
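
The summary above refers to variable selection combined with a collinearity test. As a rough illustration only, the following sketch ranks candidate variables by their absolute Pearson correlation with GSV and then skips any variable that is too strongly correlated with one already kept. The function names, the 0.8 threshold, and the sample values are assumptions made for this example and are not taken from the article, whose actual criterion may differ.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant variable carries no linear information
    return cov / math.sqrt(var_a * var_b)


def select_variables(features, target, max_collinearity=0.8):
    """Greedy selection: rank by |r| with the target, then drop collinear variables.

    `features` maps a variable name to its list of values. The 0.8 default
    is an assumed threshold for illustration, not a value from the article.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    selected = []
    for name in ranked:
        # Keep the variable only if it is not collinear with any kept variable.
        if all(
            abs(pearson(features[name], features[kept])) <= max_collinearity
            for kept in selected
        ):
            selected.append(name)
    return selected


# Hypothetical plot-level values, invented for this example.
gsv = [100, 150, 200, 250, 300]
candidates = {
    "ndvi": [0.50, 0.58, 0.66, 0.71, 0.80],
    "ndvi_dup": [0.52, 0.57, 0.67, 0.70, 0.81],  # nearly redundant with ndvi
    "texture": [3.0, 1.0, 4.0, 1.5, 5.0],
}
chosen = select_variables(candidates, gsv)
print(chosen)
```

In this toy data the near-duplicate index is dropped because it adds little beyond the higher-ranked variable, while the weaker but less redundant texture variable is kept. The same loop could use any other ranking score (for example a distance correlation or a random forest importance) in place of the Pearson ranking.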
_version_ | 1797508570483261440 |
---|---|
author | Xiaodong Xu; Hui Lin; Zhaohua Liu; Zilin Ye; Xinyu Li; Jiangping Long |
author_sort | Xiaodong Xu |
collection | DOAJ |
description | Remote sensing is becoming the mainstream approach for mapping growing stem volume (GSV) and overcoming the shortcomings of traditional labor-intensive approaches. Naturally, the accuracy of GSV estimation from remote sensing imagery depends strongly on the variable selection methods and algorithms. Thus, to reduce the uncertainty caused by variables and models, this paper proposes a combined strategy that pairs improved variable selection, including a collinearity test, with a secondary ensemble algorithm in order to obtain the optimal variable combination and derive a reliable GSV from several base models. The study extracted four types of candidate variables from Sentinel-1A and Sentinel-2A imagery: vegetation indices, spectral reflectance variables, backscattering coefficients, and texture features. An improved variable selection criterion was then developed that considers both the correlation between each variable and GSV (measured with random forest (RF), distance correlation coefficient (DC), maximal information coefficient (MIC), and Pearson correlation coefficient (PCC)) and the collinearity among the variables. The criterion was evaluated with four machine learning algorithms: classification and regression trees (CART), k-nearest neighbors (KNN), support vector regression (SVR), and artificial neural network (ANN). Additionally, a secondary ensemble with an improved weighted average (IWA) approach was proposed to estimate forest GSV reliably from the first-ensemble models constructed by Bagging and AdaBoost. The experimental results demonstrated that the proposed variable selection criterion efficiently obtained the optimal variable set without reducing the forest GSV mapping accuracy. Specifically, for the first ensemble, the relative root mean square error (rRMSE) ranged from 21.91% to 30.28% for Bagging and from 23.33% to 31.49% for AdaBoost. After the secondary ensemble with the IWA, the rRMSE ranged from 18.89% to 21.34%. Furthermore, the variance of the GSV mapped by the secondary ensemble across the various ranking methods was significantly reduced. The results show that the proposed combined strategy has great potential to reduce the GSV mapping uncertainty introduced by current variable selection approaches and algorithms. |
first_indexed | 2024-03-10T05:05:49Z |
format | Article |
id | doaj.art-362c77df3cf240a9b98af9821f064afd |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-10T05:05:49Z |
publishDate | 2021-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-362c77df3cf240a9b98af9821f064afd; 2023-11-23T01:20:47Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2021-11-01; vol. 13, no. 22, art. 4631; DOI 10.3390/rs13224631; Xiaodong Xu, Hui Lin, Zhaohua Liu, Zilin Ye, Xinyu Li, Jiangping Long (all: Research Center of Forestry Remote Sensing & Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China); keywords: growing stem volume, sentinel, variable selection, ensemble algorithm |
title | A Combined Strategy of Improved Variable Selection and Ensemble Algorithm to Map the Growing Stem Volume of Planted Coniferous Forest |
topic | growing stem volume; sentinel; variable selection; ensemble algorithm |
url | https://www.mdpi.com/2072-4292/13/22/4631 |
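
The description reports accuracy as relative root mean square error (rRMSE) and combines Bagging and AdaBoost estimates through a weighted average. The record does not give the formula behind the improved weighted average (IWA), so the sketch below substitutes a plain inverse-squared-error weighting as an assumption, with rRMSE taken as the RMSE divided by the mean observed value, in percent. All function names and numbers are hypothetical.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent: RMSE divided by the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


def inverse_error_weights(errors):
    """Weights proportional to 1 / error^2, normalized to sum to 1.

    This is an assumed stand-in for the article's IWA, not its actual formula.
    """
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / (e * e) for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_average(predictions, weights):
    """Combine per-model prediction lists into one weighted estimate per sample."""
    n_samples = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions))
        for i in range(n_samples)
    ]


# Hypothetical observed GSV and first-ensemble estimates, invented for this example.
observed = [100.0, 200.0, 300.0]
bagging = [110.0, 190.0, 310.0]
adaboost = [80.0, 220.0, 280.0]

errors = [rrmse(observed, bagging), rrmse(observed, adaboost)]
weights = inverse_error_weights(errors)
combined = weighted_average([bagging, adaboost], weights)
combined_error = rrmse(observed, combined)
print(errors, weights, combined_error)
```

For simplicity the weights here are computed on the same samples used to score the result. In practice they would normally be derived from separate validation data so that the reported error is not optimistic.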