ML-Based Streamflow Prediction in the Upper Colorado River Basin Using Climate Variables Time Series Data

Streamflow prediction plays a vital role in water resources planning in order to understand the dramatic change of climatic and hydrologic variables over different time scales. In this study, we used machine learning (ML)-based prediction models, including Random Forest Regression (RFR), Long Short-...


Bibliographic Details
Main Authors: Pouya Hosseinzadeh, Ayman Nassar, Soukaina Filali Boubrahimi, Shah Muhammad Hamdi
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Hydrology
Subjects: streamflow prediction, machine learning, time series regression, upper colorado river basin
Online Access: https://www.mdpi.com/2306-5338/10/2/29
author Pouya Hosseinzadeh
Ayman Nassar
Soukaina Filali Boubrahimi
Shah Muhammad Hamdi
collection DOAJ
description Streamflow prediction plays a vital role in water resources planning in order to understand the dramatic change of climatic and hydrologic variables over different time scales. In this study, we used machine learning (ML)-based prediction models, including Random Forest Regression (RFR), Long Short-Term Memory (LSTM), Seasonal Auto-Regressive Integrated Moving Average (SARIMA), and Facebook Prophet (PROPHET), to predict natural streamflow 24 months ahead at the Lees Ferry site, located in the lower part of the Upper Colorado River Basin (UCRB) of the US. Firstly, we used only historic streamflow data to predict 24 months ahead. Secondly, we considered meteorological components such as temperature and precipitation as additional features. We tested the models on a monthly test dataset spanning 6 years, where 24-month predictions were repeated 50 times to ensure the consistency of the results. Moreover, we performed a sensitivity analysis to identify our best-performing model. Later, we analyzed the effects of considering different span window sizes on the quality of predictions made by our best model. Finally, we applied our best-performing model, RFR, to two more rivers in different states in the UCRB to test the model’s generalizability. We evaluated the performance of the predictive models using multiple evaluation measures. The predictions of the multivariate time-series models were found to be more accurate, with an RMSE below 0.84 mm per month, an R-squared above 0.8, and a MAPE below 0.25. Therefore, we conclude that including the temperature and precipitation of the UCRB as features increases the accuracy of the predictions. Ultimately, we found that multivariate RFR performs the best among the four models and is generalizable to other rivers in the UCRB.
first_indexed 2024-03-11T08:45:20Z
format Article
id doaj.art-c53810b21af24c11af50e59fa2995d9e
institution Directory Open Access Journal
issn 2306-5338
language English
last_indexed 2024-03-11T08:45:20Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Hydrology
spelling doaj.art-c53810b21af24c11af50e59fa2995d9e
2023-11-16T20:51:32Z
eng
MDPI AG
Hydrology, ISSN 2306-5338
2023-01-01, volume 10, issue 2, article 29
DOI: 10.3390/hydrology10020029
Pouya Hosseinzadeh (Department of Computer Science, Utah State University, Logan, UT 84322, USA)
Ayman Nassar (Department of Civil and Environmental Engineering, Utah State University, Logan, UT 84322, USA)
Soukaina Filali Boubrahimi (Department of Computer Science, Utah State University, Logan, UT 84322, USA)
Shah Muhammad Hamdi (Department of Computer Science, Utah State University, Logan, UT 84322, USA)
title ML-Based Streamflow Prediction in the Upper Colorado River Basin Using Climate Variables Time Series Data
topic streamflow prediction
machine learning
time series regression
upper colorado river basin
url https://www.mdpi.com/2306-5338/10/2/29