KNN vs. Bluecat—Machine Learning vs. Classical Statistics
Uncertainty is inherent in the modelling of any physical processes. Regarding hydrological modelling, the uncertainty has multiple sources including the measurement errors of the stresses (the model inputs), the measurement errors of the hydrological process of interest (the observations against whi...
Main Authors: | Evangelos Rozos, Demetris Koutsoyiannis, Alberto Montanari |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-06-01 |
Series: | Hydrology |
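Subjects: | k-nearest neighbours, data-driven modelling, model uncertainty, machine learning, statistical analysis, hydrological modelling |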
Online Access: | https://www.mdpi.com/2306-5338/9/6/101 |
_version_ | 1797486770918522880 |
---|---|
author | Evangelos Rozos; Demetris Koutsoyiannis; Alberto Montanari |
author_sort | Evangelos Rozos |
collection | DOAJ |
description | Uncertainty is inherent in the modelling of any physical processes. Regarding hydrological modelling, the uncertainty has multiple sources including the measurement errors of the stresses (the model inputs), the measurement errors of the hydrological process of interest (the observations against which the model is calibrated), the model limitations, etc. The typical techniques to assess this uncertainty (e.g., Monte Carlo simulation) are computationally expensive and require specific preparations for each individual application (e.g., selection of appropriate probability distribution). Recently, data-driven methods have been suggested that attempt to estimate the uncertainty of a model simulation based exclusively on the available data. In this study, two data-driven methods were employed, one based on machine learning techniques, and one based on statistical approaches. These methods were tested in two real-world case studies to obtain conclusions regarding their reliability. Furthermore, the flexibility of the machine learning method allowed assessing more complex sampling schemes for the data-driven estimation of the uncertainty. The anatomisation of the algorithmic background of the two methods revealed similarities between them, with the background of the statistical method being more theoretically robust. Nevertheless, the results from the case studies indicated that both methods perform equivalently well. For this reason, data-driven methods can become a valuable tool for practitioners. |
first_indexed | 2024-03-09T23:38:00Z |
format | Article |
id | doaj.art-77ea5be6bfc445878076777b765556fb |
institution | Directory Open Access Journal |
issn | 2306-5338 |
language | English |
last_indexed | 2024-03-09T23:38:00Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Hydrology |
spelling | doaj.art-77ea5be6bfc445878076777b765556fb; 2023-11-23T16:56:34Z; eng; MDPI AG; Hydrology; 2306-5338; 2022-06-01; volume 9, issue 6, article 101; 10.3390/hydrology9060101; KNN vs. Bluecat—Machine Learning vs. Classical Statistics; Evangelos Rozos (0), Demetris Koutsoyiannis (1), Alberto Montanari (2); (0) Institute for Environmental Research & Sustainable Development, National Observatory of Athens, 15236 Athens, Greece; (1) Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, 15780 Athens, Greece; (2) Department of Civil, Chemical, Environmental and Materials Engineering (DICAM), University of Bologna, 40136 Bologna, Italy; https://www.mdpi.com/2306-5338/9/6/101; k-nearest neighbours; data-driven modelling; model uncertainty; machine learning; statistical analysis; hydrological modelling |
title | KNN vs. Bluecat—Machine Learning vs. Classical Statistics |
title_sort | knn vs bluecat machine learning vs classical statistics |
topic | k-nearest neighbours; data-driven modelling; model uncertainty; machine learning; statistical analysis; hydrological modelling |
url | https://www.mdpi.com/2306-5338/9/6/101 |