A Comparative Analysis of Machine Learning Techniques for National Glacier Mapping: Evaluating Performance through Spatial Cross-Validation in Perú

Accurate glacier mapping is crucial for assessing future water security in Andean ecosystems. Traditional accuracy assessment may be biased due to overlooking spatial autocorrelation during map validation. In recent years, spatial cross-validation (CV) strategies have been proposed in environmental...
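
As an illustration of the spatial cross-validation idea in the summary above, the following is a minimal Python sketch and not the authors' implementation. It assumes a simple grid-block scheme: samples falling in the same square block are always held out together, so nearby (spatially autocorrelated) samples never sit on both sides of a train/test split. The function names `spatial_block_folds` and `matthews_corrcoef`, the `block_size` parameter, and the round-robin assignment of blocks to folds are all hypothetical choices made for this example. The score is the Matthews correlation coefficient that the record's full description names as the evaluation metric.

```python
import math
from collections import defaultdict


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = glacier, 0 = other)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when the coefficient is undefined (zero denominator).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def spatial_block_folds(coords, block_size, n_folds):
    """Yield (train_idx, test_idx) pairs; each grid block belongs to exactly one fold.

    Assumption: coords are projected (x, y) pairs in the same units as block_size.
    """
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(i)

    # Assign whole blocks to folds in round-robin order (illustrative choice).
    fold_members = [[] for _ in range(n_folds)]
    for k, key in enumerate(sorted(blocks)):
        fold_members[k % n_folds].extend(blocks[key])

    for k in range(n_folds):
        test_idx = sorted(fold_members[k])
        train_idx = sorted(
            i for j, members in enumerate(fold_members) if j != k for i in members
        )
        yield train_idx, test_idx
```

A non-spatial baseline would instead shuffle the sample indices and split them at random; comparing the coefficient obtained under the two schemes is the kind of contrast the summary refers to.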

Bibliographic Details
Main Authors:	Marcelo Bueno, Briggitte Macera, Nilton Montoya
Format:	Article
Language:	English
Published:	MDPI AG 2023-12-01
Series:	Water
Subjects:	spatial modeling; machine learning; glacier mapping; glacier retreat; climate change; spatial autocorrelation
Online Access:	https://www.mdpi.com/2073-4441/15/24/4214

author	Marcelo Bueno, Briggitte Macera, Nilton Montoya
collection	DOAJ
description	Accurate glacier mapping is crucial for assessing future water security in Andean ecosystems. Traditional accuracy assessment may be biased due to overlooking spatial autocorrelation during map validation. In recent years, spatial cross-validation (CV) strategies have been proposed in environmental and ecological modeling to reduce bias in predictive accuracy. In this study, we demonstrate the influence of spatial autocorrelation on the accuracy assessment of glacier surface predictive models. This is achieved by comparing the performance of several widely used machine learning algorithms, including gradient-boosting machines (GBM), k-nearest neighbors (KNN), random forest (RF), and logistic regression (LR), for mapping nine main Peruvian glacier regions. Spatial and non-spatial cross-validation methods were used to evaluate the models' classification errors in terms of the Matthews correlation coefficient. Performance differences of up to 18% were found between bias-reduced (spatial) and overoptimistic (non-spatial) cross-validation results. Considering only spatial CV, k-nearest neighbors was the best model overall, achieving the highest performance in the Huallanca (0.90), Huayhuasha (0.78), Huaytapallana (0.96), Raura (0.93), Urubamba (0.96), Vilcabamba (0.93), and Vilcanota (0.92) regions, followed by logistic regression in the Blanca (0.95) and Central (0.97) regions. Our validation approach, which accounts for spatial characteristics, provides valuable insights for glacier mapping studies and future efforts on glacier retreat monitoring. Incorporating this approach improves the reliability of glacier mapping, guiding future national-level initiatives.
first_indexed	2024-03-08T20:16:53Z
format	Article
id	doaj.art-33072fb3043a4ec38f6a69bb50c7a36d
institution	Directory Open Access Journal
issn	2073-4441
language	English
last_indexed	2024-03-08T20:16:53Z
publishDate	2023-12-01
publisher	MDPI AG
record_format	Article
series	Water
spelling	doaj.art-33072fb3043a4ec38f6a69bb50c7a36d; 2023-12-22T14:49:42Z; eng; MDPI AG; Water; ISSN 2073-4441; 2023-12-01; vol. 15, no. 24, art. 4214; doi:10.3390/w15244214; Marcelo Bueno, Briggitte Macera, Nilton Montoya (all: Departamento Académico de Agricultura, Universidad Nacional de San Antonio Abad del Cusco (UNSAAC), Cusco 08000, Peru); https://www.mdpi.com/2073-4441/15/24/4214; spatial modeling; machine learning; glacier mapping; glacier retreat; climate change; spatial autocorrelation
title	A Comparative Analysis of Machine Learning Techniques for National Glacier Mapping: Evaluating Performance through Spatial Cross-Validation in Perú
topic	spatial modeling; machine learning; glacier mapping; glacier retreat; climate change; spatial autocorrelation
url	https://www.mdpi.com/2073-4441/15/24/4214

Similar Items