Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity
Cation exchange capacity (CEC) is a soil property that significantly determines nutrient availability and effectiveness of fertilizer applied in lands under different managements. CEC’s accurate and high-resolution spatial information is needed for the sustainability of agricultural management on fa...
Main Authors: | Fuat Kaya, Gaurav Mishra, Rosa Francaviglia, Ali Keshavarzi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-04-01 |
Series: | Land |
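Subjects: | digital soil mapping; soil cation exchange capacity; feature selection; uncertainty; mountainous region; geomorphology |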
Online Access: | https://www.mdpi.com/2073-445X/12/4/819 |
_version_ | 1797604713223421952 |
---|---|
author | Fuat Kaya; Gaurav Mishra; Rosa Francaviglia; Ali Keshavarzi |
author_facet | Fuat Kaya; Gaurav Mishra; Rosa Francaviglia; Ali Keshavarzi |
author_sort | Fuat Kaya |
collection | DOAJ |
description | Cation exchange capacity (CEC) is a soil property that strongly influences nutrient availability and the effectiveness of fertilizer applied to land under different management regimes. Accurate, high-resolution spatial information on CEC is needed for sustainable agricultural management on farms in Nagaland state (northeast India), which are fragmented and intertwined with the forest ecosystem. The current study applied the digital soil mapping (DSM) methodology, based on CEC values determined in soil samples obtained from 305 points in the region, which is mountainous and difficult to access. First, digital auxiliary data were obtained from three open-access sources: indices generated from a Landsat 8 OLI satellite time series, topographic variables derived from a digital elevation model (DEM), and the WorldClim dataset. The CEC values and the auxiliary data were then used to train Lasso regression (LR), stochastic gradient boosting (GBM), support vector regression (SVR), random forest (RF), and K-nearest neighbors (KNN) machine learning (ML) algorithms, which were systematically compared in the R software environment. Model performance was evaluated with the root mean square error (RMSE), coefficient of determination (R<sup>2</sup>), and mean absolute error (MAE) from 10-fold cross-validation (CV). The lowest RMSE was obtained by the RF algorithm with 4.12 cmol<sub>c</sub> kg<sup>−1</sup>, while the others were in the following order: SVR (4.27 cmol<sub>c</sub> kg<sup>−1</sup>) < KNN (4.45 cmol<sub>c</sub> kg<sup>−1</sup>) < LR (4.67 cmol<sub>c</sub> kg<sup>−1</sup>) < GBM (5.07 cmol<sub>c</sub> kg<sup>−1</sup>). In particular, WorldClim-based climate covariates such as annual mean temperature (BIO-1) and annual precipitation (BIO-12), along with elevation and solar radiation, were the most important variables in all algorithms. High uncertainty (SD) values were found in areas with low soil sampling density, a finding that should be considered in future soil surveys. |
first_indexed | 2024-03-11T04:50:39Z |
format | Article |
id | doaj.art-0ab68fae380c489b8609915ef952ddce |
institution | Directory Open Access Journal |
issn | 2073-445X |
language | English |
last_indexed | 2024-03-11T04:50:39Z |
publishDate | 2023-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Land |
spelling | doaj.art-0ab68fae380c489b8609915ef952ddce; 2023-11-17T20:02:40Z; eng; MDPI AG; Land; 2073-445X; 2023-04-01; 12; 4; 819; 10.3390/land12040819; Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity; Fuat Kaya (0), Gaurav Mishra (1), Rosa Francaviglia (2), Ali Keshavarzi (3); Department of Soil Science and Plant Nutrition, Faculty of Agriculture, Isparta University of Applied Sciences, Isparta 32260, Türkiye; Centre of Excellence on Sustainable Land Management, Indian Council of Forestry Research and Education, Dehradun 248006, Uttarakhand, India; Research Centre for Agriculture and Environment, Council for Agricultural Research and Economics, 00184 Rome, Italy; Laboratory of Remote Sensing and GIS, Department of Soil Science, University of Tehran, P.O. Box 4111, Karaj 31587-77871, Iran; Cation exchange capacity (CEC) is a soil property that strongly influences nutrient availability and the effectiveness of fertilizer applied to land under different management regimes. Accurate, high-resolution spatial information on CEC is needed for sustainable agricultural management on farms in Nagaland state (northeast India), which are fragmented and intertwined with the forest ecosystem. The current study applied the digital soil mapping (DSM) methodology, based on CEC values determined in soil samples obtained from 305 points in the region, which is mountainous and difficult to access. First, digital auxiliary data were obtained from three open-access sources: indices generated from a Landsat 8 OLI satellite time series, topographic variables derived from a digital elevation model (DEM), and the WorldClim dataset. The CEC values and the auxiliary data were then used to train Lasso regression (LR), stochastic gradient boosting (GBM), support vector regression (SVR), random forest (RF), and K-nearest neighbors (KNN) machine learning (ML) algorithms, which were systematically compared in the R software environment. Model performance was evaluated with the root mean square error (RMSE), coefficient of determination (R<sup>2</sup>), and mean absolute error (MAE) from 10-fold cross-validation (CV). The lowest RMSE was obtained by the RF algorithm with 4.12 cmol<sub>c</sub> kg<sup>−1</sup>, while the others were in the following order: SVR (4.27 cmol<sub>c</sub> kg<sup>−1</sup>) < KNN (4.45 cmol<sub>c</sub> kg<sup>−1</sup>) < LR (4.67 cmol<sub>c</sub> kg<sup>−1</sup>) < GBM (5.07 cmol<sub>c</sub> kg<sup>−1</sup>). In particular, WorldClim-based climate covariates such as annual mean temperature (BIO-1) and annual precipitation (BIO-12), along with elevation and solar radiation, were the most important variables in all algorithms. High uncertainty (SD) values were found in areas with low soil sampling density, a finding that should be considered in future soil surveys.; https://www.mdpi.com/2073-445X/12/4/819; digital soil mapping; soil cation exchange capacity; feature selection; uncertainty; mountainous region; geomorphology |
spellingShingle | Fuat Kaya Gaurav Mishra Rosa Francaviglia Ali Keshavarzi Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity Land digital soil mapping soil cation exchange capacity feature selection uncertainty mountainous region geomorphology |
title | Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity |
title_sort | combining digital covariates and machine learning models to predict the spatial variation of soil cation exchange capacity |
topic | digital soil mapping; soil cation exchange capacity; feature selection; uncertainty; mountainous region; geomorphology |
url | https://www.mdpi.com/2073-445X/12/4/819 |
work_keys_str_mv | AT fuatkaya combiningdigitalcovariatesandmachinelearningmodelstopredictthespatialvariationofsoilcationexchangecapacity AT gauravmishra combiningdigitalcovariatesandmachinelearningmodelstopredictthespatialvariationofsoilcationexchangecapacity AT rosafrancaviglia combiningdigitalcovariatesandmachinelearningmodelstopredictthespatialvariationofsoilcationexchangecapacity AT alikeshavarzi combiningdigitalcovariatesandmachinelearningmodelstopredictthespatialvariationofsoilcationexchangecapacity |
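
The description above evaluates each algorithm by RMSE, R<sup>2</sup>, and MAE from 10-fold cross-validation. The following is a minimal Python sketch of that evaluation scheme for one of the named algorithms (KNN), not the authors' R code. Everything in it is an assumption made for illustration: the function names, the choice of k = 5, the pooling of out-of-fold predictions before computing the metrics, the definition of R<sup>2</sup> as 1 − SS_res/SS_tot, and the synthetic data (only the sample count of 305 comes from the record).

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=5):
    """Predict as the mean target of the k nearest training points (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def kfold_indices(n, n_folds=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def cross_validate_knn(x, y, n_folds=10, k=5, seed=0):
    """Collect out-of-fold KNN predictions, then score them once (pooled)."""
    preds = [0.0] * len(y)
    for test_idx in kfold_indices(len(y), n_folds, seed):
        held_out = set(test_idx)
        train_x = [x[i] for i in range(len(y)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        for i in test_idx:
            preds[i] = knn_predict(train_x, train_y, x[i], k)
    return {"RMSE": rmse(y, preds), "R2": r2(y, preds), "MAE": mae(y, preds)}


if __name__ == "__main__":
    # Synthetic stand-in data: two covariates and a noisy linear target.
    rng = random.Random(42)
    x = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(305)]
    y = [10 + 8 * a - 5 * b + rng.gauss(0, 1) for a, b in x]
    print(cross_validate_knn(x, y))
```

Repeating `cross_validate_knn` with a different model function for each algorithm and sorting the results by RMSE would give a ranking of the kind reported in the description. Averaging per-fold metrics instead of pooling predictions is an equally common convention and can give slightly different numbers.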