Prediction of the soil organic carbon in the LUCAS soil database based on spectral clustering

Bibliographic Details
Main Authors: Baoyang Liu, Baofeng Guo, Renxiong Zhuo, Fan Dai, Haoyu Chi
Format: Article
Language: English
Published: Czech Academy of Agricultural Sciences 2023-02-01
Series: Soil and Water Research
Subjects: cluster analysis, regression analysis, retrieve, soil properties, vis-nir spectroscopy
Online Access: https://swr.agriculturejournals.cz/artkey/swr-202301-0006_prediction-of-the-soil-organic-carbon-in-the-lucas-soil-database-based-on-spectral-clustering.php
author Baoyang Liu
Baofeng Guo
Renxiong Zhuo
Fan Dai
Haoyu Chi
collection DOAJ
description Estimating soil organic carbon (SOC) content plays an important role in assessing soil health. Visible and near-infrared diffuse reflectance spectroscopy (Vis-NIR DRS) is a fast and inexpensive tool for measuring SOC. However, when this technology is applied over a larger area, the prediction accuracy decreases because of the heterogeneity of the samples. In this paper, we first investigate the performance of a global model on the LUCAS EU-wide topsoil database. We then test different clustering strategies: k-means clustering based on principal component analysis (PCA) and hierarchical clustering, combined with partial least squares regression (PLSR) models, and clustering based on a local PLSR approach. The best validation results were obtained for the local PLSR approach, with R² = 0.75, root mean squared error of prediction (RMSEP) = 13.38 g/kg and ratio of performance to interquartile range (RPIQ) = 2.846, but with an algorithm running time of 30.05 s. Similar results were obtained for the k-means clustering method, with R² = 0.75, RMSEP = 14.61 g/kg and RPIQ = 2.844, in only 4.52 s. This study demonstrates that the PLSR approach based on k-means clustering achieves prediction accuracy similar to that of the local PLSR approach while substantially reducing the running time. This provides a theoretical basis for adapting the spectral soil model to the needs of real-time SOC quantification.
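The three validation metrics quoted in the description can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names and sample values are hypothetical, RPIQ is taken as the interquartile range of the observed values divided by RMSEP (the commonly used definition, which the record itself does not spell out), and the quartile convention (`method="inclusive"`) is an assumption.

```python
import math
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction, in the units of y (e.g. g/kg)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(statistics.fmean(squared_errors))


def rpiq(y_true, y_pred):
    """Ratio of performance to interquartile range: IQR(observed) / RMSEP.

    Assumption: quartiles use the 'inclusive' convention; other quartile
    conventions give slightly different values.
    """
    q1, _median, q3 = statistics.quantiles(y_true, n=4, method="inclusive")
    return (q3 - q1) / rmsep(y_true, y_pred)


# Made-up SOC values in g/kg, purely for illustration (not LUCAS data).
y_obs = [10.0, 20.0, 30.0, 40.0, 50.0]
y_hat = [12.0, 18.0, 33.0, 39.0, 52.0]

print("R2    =", r_squared(y_obs, y_hat))
print("RMSEP =", rmsep(y_obs, y_hat))
print("RPIQ  =", rpiq(y_obs, y_hat))
```

Because RPIQ scales the error by the spread of the observed values, it is only comparable between models evaluated on the same validation set.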
first_indexed 2024-04-10T07:51:12Z
format Article
id doaj.art-a5379646696044f8a676af8a5fa3175a
institution Directory Open Access Journal
issn 1801-5395
1805-9384
language English
last_indexed 2024-04-10T07:51:12Z
publishDate 2023-02-01
publisher Czech Academy of Agricultural Sciences
record_format Article
series Soil and Water Research
spelling doaj.art-a5379646696044f8a676af8a5fa3175a
2023-02-23T10:13:36Z
eng
Czech Academy of Agricultural Sciences
Soil and Water Research
1801-5395
1805-9384
2023-02-01
vol. 18, no. 1, pp. 43-54
10.17221/97/2022-SWR
swr-202301-0006
Prediction of the soil organic carbon in the LUCAS soil database based on spectral clustering
Baoyang Liu, Baofeng Guo, Renxiong Zhuo, Fan Dai, Haoyu Chi (all: School of Automation, Hangzhou Dianzi University, Hangzhou, P.R. China)
https://swr.agriculturejournals.cz/artkey/swr-202301-0006_prediction-of-the-soil-organic-carbon-in-the-lucas-soil-database-based-on-spectral-clustering.php
cluster analysis
regression analysis
retrieve
soil properties
vis-nir spectroscopy
title Prediction of the soil organic carbon in the LUCAS soil database based on spectral clustering
topic cluster analysis
regression analysis
retrieve
soil properties
vis-nir spectroscopy
url https://swr.agriculturejournals.cz/artkey/swr-202301-0006_prediction-of-the-soil-organic-carbon-in-the-lucas-soil-database-based-on-spectral-clustering.php