Exploring Transferable Techniques to Retrieve Crop Biophysical and Biochemical Variables Using Sentinel-2 Data
The current study aimed to determine the spatial transferability of eXtreme Gradient Boosting (XGBoost) models for estimating biophysical and biochemical variables (BVs), using Sentinel-2 data. The specific objectives were to: (1) assess the effect of different proportions of training samples (i.e.,...
Main Authors: | Mahlatse Kganyago, Clement Adjorlolo, Paidamwoyo Mhangara |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Remote Sensing |
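Subjects: | spatial transferability; machine learning; leaf area index; precision agriculture; chlorophyll content; Sentinel-2 |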
Online Access: | https://www.mdpi.com/2072-4292/14/16/3968 |
_version_ | 1797408089130926080 |
---|---|
author | Mahlatse Kganyago, Clement Adjorlolo, Paidamwoyo Mhangara |
collection | DOAJ |
description | The current study aimed to determine the spatial transferability of eXtreme Gradient Boosting (XGBoost) models for estimating biophysical and biochemical variables (BVs) using Sentinel-2 data. The specific objectives were to: (1) assess the effect of different proportions of training samples (i.e., 25%, 50%, and 75%) available at the Target site ($\mathcal{D}_T$) on the spatial transferability of the XGBoost models and (2) evaluate the effect of the Source site ($\mathcal{D}_S$) (i.e., trained) model accuracy on the Target site (i.e., unseen) retrieval uncertainty. The results showed that the Bothaville ($\mathcal{D}_S$) → Harrismith ($\mathcal{D}_T$) Leaf Area Index (LAI) models required only smaller proportions, i.e., 25% or 50%, of the training samples to make optimal retrievals in the $\mathcal{D}_T$ (i.e., RMSE: 0.61 m$^2$ m$^{-2}$; $R^2$: 59%), while Harrismith ($\mathcal{D}_S$) → Bothaville ($\mathcal{D}_T$) LAI models required up to 75% of training samples in the $\mathcal{D}_T$ to obtain optimal LAI retrievals (i.e., RMSE: 0.63 m$^2$ m$^{-2}$; $R^2$: 67%). In contrast, the chlorophyll content models for Bothaville ($\mathcal{D}_S$) → Harrismith ($\mathcal{D}_T$) required a large proportion of samples (i.e., 75%) from the $\mathcal{D}_T$ to make optimal retrievals of Leaf Chlorophyll Content (LC$_{ab}$) (i.e., RMSE: 7.09 µg cm$^{-2}$; $R^2$: 58%) and Canopy Chlorophyll Content (CCC) (i.e., RMSE: 36.3 µg cm$^{-2}$; $R^2$: 61%), while Harrismith ($\mathcal{D}_S$) → Bothaville ($\mathcal{D}_T$) models required only 25% of the samples to achieve RMSEs of 8.16 µg cm$^{-2}$ ($R^2$: 83%) and 40.25 µg cm$^{-2}$ ($R^2$: 77%) for LC$_{ab}$ and CCC, respectively. The results also showed that higher source-site model accuracy led to better transferability for LAI retrievals. In contrast, the accuracy of the LC$_{ab}$ and CCC source-site models did not necessarily improve their transferability. Overall, the results elucidate the potential of transferable Machine Learning Regression Algorithms and are significant for the rapid retrieval of important crop BVs in data-scarce areas, thus facilitating spatially explicit information for site-specific farm management. |
first_indexed | 2024-03-09T03:53:18Z |
format | Article |
id | doaj.art-3780061543514a70ad9088221e042d26 |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-09T03:53:18Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
title | Exploring Transferable Techniques to Retrieve Crop Biophysical and Biochemical Variables Using Sentinel-2 Data |
topic | spatial transferability; machine learning; leaf area index; precision agriculture; chlorophyll content; Sentinel-2 |
url | https://www.mdpi.com/2072-4292/14/16/3968 |
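
The description above reports retrieval accuracy as RMSE and R² for models given 25%, 50%, or 75% of the Target-site training samples. As a reading aid, the following minimal sketch shows how such metrics and sample proportions could be computed. It is not the authors' code: the function names (`rmse`, `r_squared`, `draw_target_subset`), the random seed, and the LAI values are hypothetical, and R² is assumed here to be the coefficient of determination, which the record does not confirm.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def draw_target_subset(samples, proportion, seed=0):
    """Randomly draw a given proportion (e.g. 0.25) of target-site samples."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    k = round(len(samples) * proportion)
    return rng.sample(samples, k)


# Hypothetical LAI values (m^2 m^-2), invented purely for illustration.
observed_lai = [1.0, 2.0, 3.0, 4.0]
predicted_lai = [1.5, 2.0, 2.5, 4.0]
print(f"RMSE: {rmse(observed_lai, predicted_lai):.3f}")
print(f"R^2:  {r_squared(observed_lai, predicted_lai):.0%}")

# Subsets of a hypothetical pool of 40 target-site sample IDs.
target_pool = list(range(40))
for proportion in (0.25, 0.50, 0.75):
    subset = draw_target_subset(target_pool, proportion)
    print(f"{proportion:.0%} of target samples -> {len(subset)} samples")
```

In a transfer experiment of the kind described, each subset would be added to the Source-site training data before refitting the regression model, and the metrics would be computed on held-out Target-site samples.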