Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models

Runoff from the high-cold mountains area (HCMA) is the most important water resource in the arid zone, and its accurate forecasting is key to the scientific management of water resources downstream of the basin. Constrained by the scarcity of meteorological and hydrological stations in the HCMA and...

Bibliographic Details
Main Authors: Shuyang Wang, Meiping Sun, Guoyu Wang, Xiaojun Yao, Meng Wang, Jiawei Li, Hongyu Duan, Zhenyu Xie, Ruiyi Fan, Yang Yang
Format: Article
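Language: English
Published: MDPI AG 2023-09-01
Series: Water
Subjects: feature analysis; Hotan River Basin; high-cold mountains area; machine learning; runoff simulation and reconstruction
Online Access: https://www.mdpi.com/2073-4441/15/18/3222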
collection DOAJ
description Runoff from the high-cold mountains area (HCMA) is the most important water resource in the arid zone, and its accurate forecasting is key to the scientific management of water resources downstream of the basin. Constrained by the scarcity of meteorological and hydrological stations in the HCMA and the inconsistency of the observed time series, the simulation and reconstruction of mountain runoff have always been a focus of cold region hydrological research. Using runoff observations from different time periods for the Yurungkash and Kalakash Rivers, the upstream tributaries of the Hotan River on the northern slope of the Kunlun Mountains, together with meteorological data and atmospheric circulation indices, we applied feature analysis and machine learning methods to select the input elements, train and compare machine learning runoff models for the two watersheds, and reconstruct the missing runoff time series of the Kalakash River. The results show the following. (1) Air temperature is the most important driver of runoff variability in the mountainous areas upstream of the Hotan River, scoring highest on both the Pearson correlation coefficient (ρ<sub>XY</sub>) and random forest feature importance (FI) (ρ<sub>XY</sub> = 0.63, FI = 0.723), followed by soil temperature (ρ<sub>XY</sub> = 0.63, FI = 0.043); precipitation, hours of sunshine, wind speed, relative humidity, and atmospheric circulation were weakly correlated with runoff. A total of 12 elements were selected as the machine learning input data.
(2) Among the eight machine learning methods used to simulate Yurungkash River runoff, gradient boosting and random forest performed best, followed by AdaBoost and bagging, with Nash–Sutcliffe efficiency coefficients (NSE) of 0.84, 0.82, 0.78, and 0.78, respectively, while support vector regression (NSE = 0.68), ridge regression (NSE = 0.53), K-nearest neighbor (NSE = 0.56), and linear regression (NSE = 0.51) performed poorly. (3) The four best methods (gradient boosting, random forest, AdaBoost, and bagging) simulated the runoff of the Kalakash River for 1978–1998 well, with NSE exceeding 0.75, and the runoff reconstructed for the missing period (1999–2019) reflected the intra-annual and inter-annual characteristics of runoff well.
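The abstract reports two scores, the Pearson correlation coefficient (ρ<sub>XY</sub>) used to screen input elements and the Nash–Sutcliffe efficiency (NSE) used to rank models. The following is a minimal sketch of their standard textbook definitions, not the authors' code; the function names `pearson` and `nse` and the sample values are illustrative assumptions and do not come from the article.

```python
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = fmean(x), fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the model is no better than always
    predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = fmean(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - squared_error / spread


if __name__ == "__main__":
    # Made-up monthly values for illustration only (not data from the article).
    air_temperature = [-8.0, -3.0, 4.0, 11.0, 15.0, 9.0]
    observed_runoff = [12.0, 15.0, 30.0, 80.0, 140.0, 60.0]
    simulated_runoff = [14.0, 18.0, 35.0, 70.0, 125.0, 66.0]
    print("rho_XY:", round(pearson(air_temperature, observed_runoff), 3))
    print("NSE:", round(nse(observed_runoff, simulated_runoff), 3))
```

Read this way, an NSE of 0.84 means the model's squared error is 16% of the variance of the observations about their mean, while an NSE of 0.51 leaves roughly half of that variance unexplained.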
first_indexed 2024-03-10T21:51:27Z
format Article
id doaj.art-5dee1365acb541f88bde200e3570e1a5
institution Directory Open Access Journal
issn 2073-4441
language English
last_indexed 2024-03-10T21:51:27Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Water
spelling doaj.art-5dee1365acb541f88bde200e3570e1a5
2023-11-19T13:25:32Z
eng
MDPI AG
Water
2073-4441
2023-09-01
15(18), 3222
10.3390/w15183222
Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models
Shuyang Wang (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Meiping Sun (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Guoyu Wang (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Xiaojun Yao (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Meng Wang (Chemistry and Chemical Engineering, Chongqing University of Science and Technology, Chongqing 401331, China)
Jiawei Li (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Hongyu Duan (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Zhenyu Xie (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Ruiyi Fan (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
Yang Yang (Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China)
https://www.mdpi.com/2073-4441/15/18/3222
feature analysis
Hotan River Basin
high-cold mountains area
machine learning
runoff simulation and reconstruction
title Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models
topic feature analysis
Hotan River Basin
high-cold mountains area
machine learning
runoff simulation and reconstruction
url https://www.mdpi.com/2073-4441/15/18/3222