KNN vs. Bluecat—Machine Learning vs. Classical Statistics

Uncertainty is inherent in the modelling of any physical processes. Regarding hydrological modelling, the uncertainty has multiple sources including the measurement errors of the stresses (the model inputs), the measurement errors of the hydrological process of interest (the observations against whi...

Bibliographic Details
Main Authors:	Evangelos Rozos, Demetris Koutsoyiannis, Alberto Montanari
Format:	Article
Language:	English
Published:	MDPI AG 2022-06-01
Series:	Hydrology
Subjects:	k-nearest neighbours; data-driven modelling; model uncertainty; machine learning; statistical analysis; hydrological modelling
Online Access:	https://www.mdpi.com/2306-5338/9/6/101

ISSN:	2306-5338
DOI:	10.3390/hydrology9060101
Volume/Issue:	9(6), article 101
Collection:	DOAJ
Institution:	Directory Open Access Journal
Record ID:	doaj.art-77ea5be6bfc445878076777b765556fb
First Indexed:	2024-03-09T23:38:00Z
Last Indexed:	2024-03-09T23:38:00Z
Affiliations:	Evangelos Rozos: Institute for Environmental Research & Sustainable Development, National Observatory of Athens, 15236 Athens, Greece
	Demetris Koutsoyiannis: Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, 15780 Athens, Greece
	Alberto Montanari: Department of Civil, Chemical, Environmental and Materials Engineering (DICAM), University of Bologna, 40136 Bologna, Italy
Description:	Uncertainty is inherent in the modelling of any physical processes. Regarding hydrological modelling, the uncertainty has multiple sources including the measurement errors of the stresses (the model inputs), the measurement errors of the hydrological process of interest (the observations against which the model is calibrated), the model limitations, etc. The typical techniques to assess this uncertainty (e.g., Monte Carlo simulation) are computationally expensive and require specific preparations for each individual application (e.g., selection of appropriate probability distribution). Recently, data-driven methods have been suggested that attempt to estimate the uncertainty of a model simulation based exclusively on the available data. In this study, two data-driven methods were employed, one based on machine learning techniques, and one based on statistical approaches. These methods were tested in two real-world case studies to obtain conclusions regarding their reliability. Furthermore, the flexibility of the machine learning method allowed assessing more complex sampling schemes for the data-driven estimation of the uncertainty. The anatomisation of the algorithmic background of the two methods revealed similarities between them, with the background of the statistical method being more theoretically robust. Nevertheless, the results from the case studies indicated that both methods perform equivalently well. For this reason, data-driven methods can become a valuable tool for practitioners.

Similar Items