Leaf Area Index Estimation Algorithm for GF-5 Hyperspectral Data Based on Different Feature Selection and Machine Learning Methods

Leaf area index (LAI) is an essential vegetation parameter that represents the light energy utilization and vegetation canopy structure. As the only in-operation hyperspectral satellite launched by China, GF-5 is potentially useful for accurate LAI estimation. However, there is no research focus on...

Bibliographic Details
Main Authors: Zhulin Chen, Kun Jia, Chenchao Xiao, Dandan Wei, Xiang Zhao, Jinhui Lan, Xiangqin Wei, Yunjun Yao, Bing Wang, Yuan Sun, Lei Wang
Format: Article
Language: English
Published: MDPI AG 2020-07-01
Series: Remote Sensing
Subjects: GF-5; LAI; feature selection; machine learning
Online Access: https://www.mdpi.com/2072-4292/12/13/2110
collection DOAJ
description Leaf area index (LAI) is an essential vegetation parameter that reflects light energy utilization and vegetation canopy structure. As the only in-operation hyperspectral satellite launched by China, GF-5 is potentially useful for accurate LAI estimation. However, no research has focused on evaluating GF-5 data for LAI estimation. Hyperspectral remote sensing data contain abundant information about the reflective characteristics of vegetation canopies, but this abundance also easily leads to the curse of dimensionality. Therefore, feature selection (FS) is necessary to reduce data redundancy and achieve more reliable estimates. Machine learning (ML) algorithms are now widely used for FS, and the same ML algorithm is usually applied to both FS and regression in LAI estimation. However, no evidence suggests that this is the optimal solution. This study therefore evaluates the capacity of GF-5 spectral reflectance for estimating LAI and the performance of different combinations of FS and ML algorithms. First, the PROSAIL model, which couples the leaf optical properties model PROSPECT with the scattering by arbitrarily inclined leaves (SAIL) model, was used to generate simulated GF-5 reflectance data under different vegetation and soil conditions. Then three FS methods, namely random forest (RF), K-means clustering (K-means) and mean impact value (MIV), and three ML algorithms, namely random forest regression (RFR), back propagation neural network (BPNN) and K-nearest neighbor (KNN), were used to develop nine LAI estimation models. The FS process was conducted twice using different strategies: first, the three FS methods were used to find the lowest number of dimensions that maintained the estimation accuracy achieved with all bands; then, the sequential backward selection (SBS) method was used to eliminate the bands having minimal impact on LAI estimation accuracy. Finally, the three best estimation models were selected and evaluated using reference LAI. The results showed that although the RF_RFR model (RF used for feature selection and RFR used for regression) achieved reliable LAI estimates (coefficient of determination (R²) = 0.828, root mean square error (RMSE) = 0.839), the poor performance (R² = 0.763, RMSE = 0.987) of the MIV_BPNN model (MIV used for feature selection and BPNN used for regression) suggested that conducting feature selection and regression with the same ML algorithm does not always ensure an optimal estimation. Moreover, RF selection preserved the most informative bands for LAI estimation, so that each ML regression method could achieve satisfactory estimation results. Finally, the results indicated that the RF_KNN model (RF used for feature selection and KNN used for regression) with reflectance from seven GF-5 spectral bands achieved better estimation results than the other models when validated with simulated data (R² = 0.834, RMSE = 0.824) and actual reference LAI (R² = 0.659, RMSE = 0.697).
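The band-elimination step and the two accuracy metrics named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic stand-ins for PROSAIL-simulated reflectance (six hypothetical "bands", of which only the first two drive the target), the RF importance ranking is omitted, and the neighbor count `k` and tolerance `tol` are illustrative assumptions. It shows sequential backward selection wrapped around a K-nearest-neighbor regressor, scored with RMSE and R².

```python
import math
import random


def knn_predict(train_X, train_y, x, k=5):
    """Average the targets of the k training samples closest to x."""
    nearest = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(bands, train, valid, k=5):
    """Return (RMSE, R2) of KNN regression restricted to the given band indices."""
    (X_tr, y_tr), (X_va, y_va) = train, valid
    X_tr = [[row[b] for b in bands] for row in X_tr]
    X_va = [[row[b] for b in bands] for row in X_va]
    pred = [knn_predict(X_tr, y_tr, x, k) for x in X_va]
    return rmse(y_va, pred), r2(y_va, pred)


def backward_select(bands, train, valid, tol=0.01):
    """Sequential backward selection: repeatedly drop the band whose removal
    gives the lowest validation RMSE, stopping once the best removal would
    raise RMSE by more than tol (tol is an assumed stopping rule)."""
    bands = list(bands)
    best = evaluate(bands, train, valid)[0]
    while len(bands) > 1:
        trial_rmse, drop = min(
            (evaluate([b for b in bands if b != d], train, valid)[0], d)
            for d in bands
        )
        if trial_rmse > best + tol:
            break
        bands.remove(drop)
        best = min(best, trial_rmse)
    return bands


def make_data(n, rng, n_bands=6):
    """Synthetic stand-in data: only bands 0 and 1 influence the pseudo-LAI."""
    X = [[rng.random() for _ in range(n_bands)] for _ in range(n)]
    y = [4.0 * row[0] + 2.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


rng = random.Random(0)
train = make_data(200, rng)
valid = make_data(100, rng)
all_bands = list(range(6))

base_rmse, base_r2 = evaluate(all_bands, train, valid)
selected = backward_select(all_bands, train, valid)
final_rmse, final_r2 = evaluate(selected, train, valid)

print("all bands:", all_bands, "RMSE", round(base_rmse, 3), "R2", round(base_r2, 3))
print("selected: ", selected, "RMSE", round(final_rmse, 3), "R2", round(final_r2, 3))
```

In this sketch the same validation set both guides the elimination and reports the final score, which keeps the example short but would overstate accuracy in practice; a separate held-out set, such as the reference LAI used in the study, is the appropriate final check.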
first_indexed 2024-03-10T18:44:08Z
format Article
id doaj.art-02431103d5c94576a0c9390de0d9a438
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T18:44:08Z
publishDate 2020-07-01
publisher MDPI AG
record_format Article
series Remote Sensing
doi 10.3390/rs12132110
volume 12
issue 13
article_number 2110
affiliation Zhulin Chen: State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Kun Jia: State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Chenchao Xiao: Land Satellite Remote Sensing Application Center, Ministry of Natural Resource of the People’s Republic of China, Beijing 100048, China
Dandan Wei: Land Satellite Remote Sensing Application Center, Ministry of Natural Resource of the People’s Republic of China, Beijing 100048, China
Xiang Zhao: State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Jinhui Lan: Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083, China
Xiangqin Wei: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Yunjun Yao: State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Bing Wang: State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Yuan Sun: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Lei Wang: Northwest National Key Laboratory Breeding Base for Land Degradation and Ecological Restoration, Ningxia University, Yinchuan 750021, China
title Leaf Area Index Estimation Algorithm for GF-5 Hyperspectral Data Based on Different Feature Selection and Machine Learning Methods
topic GF-5
LAI
feature selection
machine learning
url https://www.mdpi.com/2072-4292/12/13/2110