Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity

Cation exchange capacity (CEC) is a soil property that strongly influences nutrient availability and the effectiveness of fertilizer applied to land under different management regimes. Accurate, high-resolution spatial information on CEC is needed for the sustainability of agricultural management on fa...


Bibliographic Details
Main Authors: Fuat Kaya, Gaurav Mishra, Rosa Francaviglia, Ali Keshavarzi
Format: Article
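Language: English
Published: MDPI AG 2023-04-01
Series: Land
Subjects: digital soil mapping, soil cation exchange capacity, feature selection, uncertainty, mountainous region, geomorphology
Online Access: https://www.mdpi.com/2073-445X/12/4/819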
collection DOAJ
description Cation exchange capacity (CEC) is a soil property that strongly influences nutrient availability and the effectiveness of fertilizer applied to land under different management regimes. Accurate, high-resolution spatial information on CEC is needed for the sustainability of agricultural management on farms in Nagaland state (northeast India), which are fragmented and intertwined with the forest ecosystem. The current study applied the digital soil mapping (DSM) methodology, based on CEC values determined in soil samples obtained from 305 points in the region, which is mountainous and difficult to access. First, digital auxiliary data were obtained from three open-access sources: indices generated from a Landsat 8 OLI satellite time series, topographic variables derived from a digital elevation model (DEM), and the WorldClim dataset. Next, the CEC values and the auxiliary data were used to train Lasso regression (LR), stochastic gradient boosting (GBM), support vector regression (SVR), random forest (RF), and K-nearest neighbors (KNN) machine learning (ML) algorithms, which were systematically compared in the R environment. Model performance was evaluated with the root mean square error (RMSE), coefficient of determination (R²), and mean absolute error (MAE) from 10-fold cross-validation (CV). The RF algorithm obtained the lowest RMSE, 4.12 cmolc kg⁻¹, and the other algorithms followed in this order: SVR (4.27 cmolc kg⁻¹) < KNN (4.45 cmolc kg⁻¹) < LR (4.67 cmolc kg⁻¹) < GBM (5.07 cmolc kg⁻¹). In particular, covariates such as the WorldClim-based annual mean temperature (BIO-1) and annual precipitation (BIO-12), along with elevation and solar radiation, were the most important variables in all algorithms. High uncertainty (standard deviation, SD) values were found in areas with low soil sampling density, a finding that should be considered in future soil surveys.
first_indexed 2024-03-11T04:50:39Z
format Article
id doaj.art-0ab68fae380c489b8609915ef952ddce
institution Directory of Open Access Journals
issn 2073-445X
language English
last_indexed 2024-03-11T04:50:39Z
publishDate 2023-04-01
publisher MDPI AG
record_format Article
series Land
spelling doaj.art-0ab68fae380c489b8609915ef952ddce
2023-11-17T20:02:40Z
eng
MDPI AG
Land
2073-445X
2023-04-01
Volume 12, Issue 4, Article 819
10.3390/land12040819
Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity
Fuat Kaya (0), Gaurav Mishra (1), Rosa Francaviglia (2), Ali Keshavarzi (3)
0: Department of Soil Science and Plant Nutrition, Faculty of Agriculture, Isparta University of Applied Sciences, Isparta 32260, Türkiye
1: Centre of Excellence on Sustainable Land Management, Indian Council of Forestry Research and Education, Dehradun 248006, Uttarakhand, India
2: Research Centre for Agriculture and Environment, Council for Agricultural Research and Economics, 00184 Rome, Italy
3: Laboratory of Remote Sensing and GIS, Department of Soil Science, University of Tehran, P.O. Box 4111, Karaj 31587-77871, Iran
https://www.mdpi.com/2073-445X/12/4/819
digital soil mapping; soil cation exchange capacity; feature selection; uncertainty; mountainous region; geomorphology
title Combining Digital Covariates and Machine Learning Models to Predict the Spatial Variation of Soil Cation Exchange Capacity
topic digital soil mapping
soil cation exchange capacity
feature selection
uncertainty
mountainous region
geomorphology
url https://www.mdpi.com/2073-445X/12/4/819