Prediction of the soil organic carbon in the LUCAS soil database based on spectral clustering
The estimation of the level of the soil organic carbon (SOC) content plays an important role in assessing the soil health state. Visible and Near Infrared Diffuse Reflectance Spectroscopy (Vis-NIR DRS) is a fast and cheap tool for measuring the SOC. However, when this technology is applied on a larg...
Main Authors: | Baoyang Liu, Baofeng Guo, Renxiong Zhuo, Fan Dai, Haoyu Chi |
---|---|
Format: | Article |
Language: | English |
Published: | Czech Academy of Agricultural Sciences, 2023-02-01 |
Series: | Soil and Water Research |
Subjects: | cluster analysis; regression analysis; retrieve; soil properties; vis-nir spectroscopy |
Online Access: | https://swr.agriculturejournals.cz/artkey/swr-202301-0006_prediction-of-the-soil-organic-carbon-in-the-lucas-soil-database-based-on-spectral-clustering.php |
_version_ | 1797897037070467072 |
---|---|
author | Baoyang Liu; Baofeng Guo; Renxiong Zhuo; Fan Dai; Haoyu Chi |
author_sort | Baoyang Liu |
collection | DOAJ |
description | The estimation of the level of the soil organic carbon (SOC) content plays an important role in assessing the soil health state. Visible and Near Infrared Diffuse Reflectance Spectroscopy (Vis-NIR DRS) is a fast and cheap tool for measuring the SOC. However, when this technology is applied on a larger area, the soil prediction accuracy decreases due to the heterogeneity of the samples. In this paper, we first investigate the global model performance in the LUCAS EU-wide topsoil database. Then, different clustering strategies were tested, including k-means clustering based on principal component analysis (PCA) and hierarchical clustering, each combined with partial least squares regression (PLSR) models, and a clustering based on a local PLSR approach. The best validation results were obtained for the local PLSR approach with R² = 0.75, root mean squared error of prediction (RMSEP) = 13.38 g/kg and ratio of performance to interquartile range (RPIQ) = 2.846, but the algorithm running time was 30.05 s. Similar results were obtained for the k-means clustering method with R² = 0.75, RMSEP = 14.61 g/kg and RPIQ = 2.844, at only 4.52 s. This study demonstrates that the PLSR approach based on k-means clustering achieves prediction accuracy similar to that of the local PLSR approach, while significantly improving the algorithm speed. This provides the theoretical basis for adapting the spectral soil model to the needs of real-time SOC quantification. |
first_indexed | 2024-04-10T07:51:12Z |
format | Article |
id | doaj.art-a5379646696044f8a676af8a5fa3175a |
institution | Directory of Open Access Journals |
issn | 1801-5395 1805-9384 |
language | English |
last_indexed | 2024-04-10T07:51:12Z |
publishDate | 2023-02-01 |
publisher | Czech Academy of Agricultural Sciences |
record_format | Article |
series | Soil and Water Research |
spelling | doaj.art-a5379646696044f8a676af8a5fa3175a; 2023-02-23T10:13:36Z; eng; Czech Academy of Agricultural Sciences; Soil and Water Research; 1801-5395; 1805-9384; 2023-02-01; 18; 1; 43; 54; 10.17221/97/2022-SWR; swr-202301-0006; Prediction of the soil organic carbon in the LUCAS soil database based on spectral clustering; Baoyang Liu, Baofeng Guo, Renxiong Zhuo, Fan Dai, Haoyu Chi (all: School of Automation, Hangzhou Dianzi University, Hangzhou, P.R. China); https://swr.agriculturejournals.cz/artkey/swr-202301-0006_prediction-of-the-soil-organic-carbon-in-the-lucas-soil-database-based-on-spectral-clustering.php; cluster analysis; regression analysis; retrieve; soil properties; vis-nir spectroscopy |
title | Prediction of the soil organic carbon in the LUCAS soil database based on spectral clustering |
topic | cluster analysis; regression analysis; retrieve; soil properties; vis-nir spectroscopy |
url | https://swr.agriculturejournals.cz/artkey/swr-202301-0006_prediction-of-the-soil-organic-carbon-in-the-lucas-soil-database-based-on-spectral-clustering.php |
work_keys_str_mv | AT baoyangliu predictionofthesoilorganiccarboninthelucassoildatabasebasedonspectralclustering AT baofengguo predictionofthesoilorganiccarboninthelucassoildatabasebasedonspectralclustering AT renxiongzhuo predictionofthesoilorganiccarboninthelucassoildatabasebasedonspectralclustering AT fandai predictionofthesoilorganiccarboninthelucassoildatabasebasedonspectralclustering AT haoyuchi predictionofthesoilorganiccarboninthelucassoildatabasebasedonspectralclustering |