Machine-learning methods for hydrological imputation data: analysis of the goodness of fit of the model in hydrographic systems of the Pacific - Ecuador

Bibliographic Details
Main Authors: Diego Heras, Carlos Matovelle
Format: Article
Language: English
Published: Instituto de Pesquisas Ambientais em Bacias Hidrográficas (IPABHi) 2021-06-01
Series: Revista Ambiente & Água
Subjects: data imputation; hydrographic systems; machine learning
Online Access: https://www.scielo.br/j/ambiagua/a/m3nQgWQLmhHqPghwMKHtnNP/?lang=en
author Diego Heras
Carlos Matovelle
collection DOAJ
description Machine-learning methods have been widely developed and applied in hydrology, especially for modelling systems that lack sufficient data. A common form of this problem is a data series with missing values; such series need not be discarded, because the gaps can be filled by imputation to obtain complete sets. This research therefore compares machine-learning techniques to identify those best suited to the Pacific hydrographic systems of Ecuador. The study used hydro-meteorological records from monitoring stations in the watersheds of the Esmeraldas, Cañar and Jubones Rivers, covering 22 years from 1990 to 2012. The imputed variables were precipitation and flow. Estimators from the Python scikit-learn library were used; the library integrates a wide range of machine-learning algorithms, such as Linear Regression and Random Forest. Random Forest yielded the lowest mean squared error, making it the machine-learning imputation method that best fits the systems and data analyzed.
first_indexed 2024-12-19T14:59:45Z
format Article
id doaj.art-e5623b8ac7e34b8f90bd2525165de699
institution Directory Open Access Journal
issn 1980-993X
language English
last_indexed 2024-12-19T14:59:45Z
publishDate 2021-06-01
publisher Instituto de Pesquisas Ambientais em Bacias Hidrográficas (IPABHi)
record_format Article
series Revista Ambiente & Água
spelling doaj.art-e5623b8ac7e34b8f90bd2525165de699
2022-12-21T20:16:37Z
eng
Instituto de Pesquisas Ambientais em Bacias Hidrográficas (IPABHi)
Revista Ambiente & Água
1980-993X
2021-06-01
16
3
1
12
10.4136/ambi-agua.2708
Machine-learning methods for hydrological imputation data: analysis of the goodness of fit of the model in hydrographic systems of the Pacific - Ecuador
Diego Heras (0) https://orcid.org/0000-0002-8729-0981
Carlos Matovelle (1) https://orcid.org/0000-0003-2267-0323
(0) Center for Research, Innovation and technology transfer. Environmental Engineering. Catholic University of Cuenca, Avenida de las Americas, EC 010101, Cuenca, Azuay, Ecuador.
(1) Center for Research, Innovation and technology transfer. Environmental Engineering. Catholic University of Cuenca, Avenida de las Americas, EC 010101, Cuenca, Azuay, Ecuador.
https://www.scielo.br/j/ambiagua/a/m3nQgWQLmhHqPghwMKHtnNP/?lang=en
data imputation
hydrographic systems
machine learning
title Machine-learning methods for hydrological imputation data: analysis of the goodness of fit of the model in hydrographic systems of the Pacific - Ecuador
topic data imputation
hydrographic systems
machine learning
url https://www.scielo.br/j/ambiagua/a/m3nQgWQLmhHqPghwMKHtnNP/?lang=en
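
The comparison summarized in the description field (scikit-learn Linear Regression versus Random Forest, ranked by mean squared error) might look roughly like the sketch below. It is not the authors' code: the data are synthetic stand-ins, and the use of two neighbouring-station series as predictors, the 20% gap rate, the hold-out split, and the helper name `compare_imputers` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): two neighbouring-station series act as
# predictors for a target series that depends on them non-linearly.
rng = np.random.default_rng(0)
n_days = 600
neighbours = rng.gamma(shape=2.0, scale=5.0, size=(n_days, 2))
target = (0.6 * neighbours[:, 0]
          + 0.05 * neighbours[:, 1] ** 2
          + rng.normal(0.0, 1.0, n_days))

# Simulate gaps: blank out roughly 20% of the target series (assumed rate).
missing = rng.random(n_days) < 0.2
observed = ~missing
series = target.copy()
series[missing] = np.nan


def compare_imputers(X, y, seed=0):
    """Fit both regressors on a training split and return hold-out MSE per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "LinearRegression": LinearRegression(),
        "RandomForest": RandomForestRegressor(n_estimators=200, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = mean_squared_error(y_test, model.predict(X_test))
    return scores, models


# Score both candidates on the observed part of the series only.
scores, models = compare_imputers(neighbours[observed], series[observed])
best = min(scores, key=scores.get)

# Refit the lower-MSE model on all observed values, then fill the gaps.
models[best].fit(neighbours[observed], series[observed])
imputed = series.copy()
imputed[missing] = models[best].predict(neighbours[missing])

print(scores, "-> selected:", best)
```

Which model scores lower here depends entirely on the synthetic data; the sketch only shows the mechanics of ranking candidate imputers by hold-out mean squared error before filling the gaps.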