Spatial modelling of topsoil properties in Romania using geostatistical methods and machine learning

Many research topics in soil science and agriculture require digital maps of soil properties as input data. Such maps can be produced by digital soil mapping (DSM) techniques, which have developed steadily over recent decades. Our research focuses on the application of geosta...

Bibliographic Details
Main Authors: Cristian Valeriu Patriche, Bogdan Roşca, Radu Gabriel Pîrnău, Ionuţ Vasiliniuc
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10446225/?tool=EBI
author Cristian Valeriu Patriche
Bogdan Roşca
Radu Gabriel Pîrnău
Ionuţ Vasiliniuc
collection DOAJ
description Many research topics in soil science and agriculture require digital maps of soil properties as input data. Such maps can be produced by digital soil mapping (DSM) techniques, which have developed steadily over recent decades. Our research focuses on the application of geostatistical methods (including ordinary kriging, regression-kriging and geographically weighted regression) and machine learning algorithms to produce high-resolution digital maps of topsoil properties in Romania. Six continuous predictors were considered in our study (digital elevation model, topographic wetness index, normalized difference vegetation index, slope, latitude and longitude). A tolerance test was performed to ensure that all predictors can be used for digital soil mapping. The input soil data were extracted from the LUCAS database and include 7 chemical properties (pH, electrical conductivity, calcium carbonate, organic carbon, N, P, K) and the particle-size fractions (sand, silt, clay). Spatial autocorrelation is higher for pH, organic carbon and calcium carbonate, as indicated by the partial sill / nugget ratio of the semivariograms, meaning that these properties are more predictable than the others by kriging interpolation. The optimal DSM method was selected by independent sample validation, using resampled statistics from 100 samples randomly extracted from the validation dataset. An additional independent sample of soil profiles, comprising legacy soil data, and the 200k Romania soil map were used for supplementary validation. The results show that machine learning and regression-kriging are the optimal methods in most cases. Among the machine learning algorithms tested, the best performance is associated with Support Vector Machines and Random Forests. Geographically weighted regression is also among the optimal methods for the spatial prediction of pH and calcium carbonate. Good predictions were achieved for pH (R2 of 0.417–0.469, depending on the method), organic carbon (R2 of 0.302–0.443) and calcium carbonate (R2 of 0.300–0.330), and moderate predictions for electrical conductivity, total nitrogen, silt and sand (R2 of 0.155–0.331), while the poorest predictions were obtained for phosphorus content (R2 of 0.015–0.044). LUCAS proved to be a reliable and useful soil database, and the resulting spatial distributions of soil properties can be used in further national and regional soil studies.
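
The abstract mentions a tolerance test on the predictors without giving details. As a minimal sketch, assuming the usual definition (tolerance of a predictor = 1 - R2 obtained when that predictor is regressed on all the others, so values near 0 signal collinearity), the check could look like the following. The function name `tolerance`, the synthetic data and the variable names are hypothetical illustrations, not the authors' code or data, and the abstract does not state which cut-off was applied.

```python
import numpy as np


def tolerance(X):
    """Tolerance (1 - R^2) of each column of X regressed on the other columns.

    Hypothetical helper: assumes the standard multicollinearity definition,
    which the abstract does not spell out.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares with an intercept term.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        tol[j] = ss_res / ss_tot  # equals 1 - R^2
    return tol


# Synthetic stand-ins for predictors (not real elevation, slope or NDVI data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
# Append a fourth column that is almost a copy of the first one.
X_collinear = np.column_stack([X, X[:, 0] + 0.01 * rng.normal(size=500)])

tol_independent = tolerance(X)
tol_collinear = tolerance(X_collinear)
print("independent predictors:", np.round(tol_independent, 3))
print("with a near-duplicate: ", np.round(tol_collinear, 3))
```

In this sketch the independent columns keep a tolerance close to 1, while the original column and its near-duplicate both drop towards 0, which is the pattern that would lead to a predictor being excluded.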
first_indexed 2024-03-12T13:19:23Z
format Article
id doaj.art-62dfdd4fc380404ca0ee1bdd4489e30d
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-03-12T13:19:23Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-62dfdd4fc380404ca0ee1bdd4489e30d 2023-08-26T05:31:34Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2023-01-01 18 8 Spatial modelling of topsoil properties in Romania using geostatistical methods and machine learning Cristian Valeriu Patriche, Bogdan Roşca, Radu Gabriel Pîrnău, Ionuţ Vasiliniuc https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10446225/?tool=EBI
title Spatial modelling of topsoil properties in Romania using geostatistical methods and machine learning
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10446225/?tool=EBI