Streamflow classification by employing various machine learning models for peninsular Malaysia

Bibliographic Details
Main Authors: Nouar AlDahoul, Mhd Adel Momo, K. L. Chong, Ali Najah Ahmed, Yuk Feng Huang, Mohsen Sherif, Ahmed El-Shafie
Format: Article
Language: English
Published: Nature Portfolio 2023-09-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-41735-9
collection DOAJ
description Abstract Due to excessive streamflow (SF), Peninsular Malaysia has historically experienced floods and droughts. Forecasting streamflow to mitigate municipal and environmental damage is therefore crucial. Streamflow prediction has been extensively demonstrated in the literature to estimate the continuous values of streamflow level. Prediction of continuous values of streamflow is not necessary in several applications, and at the same time it is a very challenging task because of uncertainty. A streamflow category prediction is more advantageous for addressing the uncertainty in numerical point forecasting, considering that its predictions are linked to a propensity to belong to the pre-defined classes. Here, we formulate streamflow prediction as a time series classification with discrete ranges of values, each representing a class, to classify streamflow into five or ten classes using machine learning approaches in various rivers in Malaysia. The findings reveal that several models, specifically LSTM, outperform others in predicting the following n time steps of streamflow, because LSTM is able to learn the mapping between streamflow time series of 2 or 3 days ahead better than support vector machine (SVM) and gradient boosting (GB). LSTM produces a higher F1 score in various rivers (by 5% in Johor, 2% in Kelantan, Melaka, and Selangor, and 4% in Perlis) in the 2-days-ahead scenario. Furthermore, the ensemble stacking of SVM and GB achieves high performance in terms of F1 score and quadratic weighted kappa. Ensemble stacking gives a 3% higher F1 score for the Perak river compared to SVM and gradient boosting.
first_indexed 2024-03-10T21:57:00Z
id doaj.art-e7e1bebfbed442bda3a4054d8c5c5833
institution Directory of Open Access Journals
issn 2045-2322
last_indexed 2024-03-10T21:57:00Z
affiliations Nouar AlDahoul: Computer Science, New York University Abu Dhabi
Mhd Adel Momo: Fleet Management Systems & Technologies
K. L. Chong: Faculty of Engineering & Quantity Surveying, INTI International University (INTI-IU), Persiaran Perdana BBN
Ali Najah Ahmed: Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional
Yuk Feng Huang: Department of Civil Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman
Mohsen Sherif: National Water and Energy Center, United Arab Emirates University
Ahmed El-Shafie: Department of Civil Engineering, Faculty of Engineering, University of Malaya (UM)