Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models
Runoff from the high-cold mountains area (HCMA) is the most important water resource in the arid zone, and its accurate forecasting is key to the scientific management of water resources downstream of the basin. Constrained by the scarcity of meteorological and hydrological stations in the HCMA and...
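Main Authors: | Shuyang Wang, Meiping Sun, Guoyu Wang, Xiaojun Yao, Meng Wang, Jiawei Li, Hongyu Duan, Zhenyu Xie, Ruiyi Fan, Yang Yang |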
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-09-01 |
Series: | Water |
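Subjects: | feature analysis, Hotan River Basin, high-cold mountains area, machine learning, runoff simulation and reconstruction |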
Online Access: | https://www.mdpi.com/2073-4441/15/18/3222 |
_version_ | 1797576316568993792 |
---|---|
author | Shuyang Wang Meiping Sun Guoyu Wang Xiaojun Yao Meng Wang Jiawei Li Hongyu Duan Zhenyu Xie Ruiyi Fan Yang Yang |
author_facet | Shuyang Wang Meiping Sun Guoyu Wang Xiaojun Yao Meng Wang Jiawei Li Hongyu Duan Zhenyu Xie Ruiyi Fan Yang Yang |
author_sort | Shuyang Wang |
collection | DOAJ |
description | Runoff from the high-cold mountains area (HCMA) is the most important water resource in the arid zone, and its accurate forecasting is key to the scientific management of water resources downstream of the basin. Constrained by the scarcity of meteorological and hydrological stations in the HCMA and the inconsistency of the observed time series, the simulation and reconstruction of mountain runoff have always been a focus of cold region hydrological research. Based on runoff observations from different time periods for the Yurungkash and Kalakash Rivers, the upstream tributaries of the Hotan River on the northern slope of the Kunlun Mountains, together with meteorological data and atmospheric circulation indices, we used feature analysis and machine learning methods to select the input elements, train the models, simulate runoff, select the best-performing machine learning models for the two watersheds, and reconstruct the missing runoff time series of the Kalakash River. The results show the following. (1) Air temperature is the most important driver of runoff variability in the mountainous areas upstream of the Hotan River, with the strongest performance in terms of the Pearson correlation coefficient (ρ<sub>XY</sub>) and random forest feature importance (FI) (ρ<sub>XY</sub> = 0.63, FI = 0.723), followed by soil temperature (ρ<sub>XY</sub> = 0.63, FI = 0.043); precipitation, hours of sunshine, wind speed, relative humidity, and atmospheric circulation were only weakly correlated with runoff. A total of 12 elements were selected as the machine learning input data. 
(2) Comparing the Yurungkash River runoff simulated by eight machine learning methods, we found that gradient boosting and random forest performed best, followed by AdaBoost and bagging, with Nash–Sutcliffe efficiency coefficients (NSE) of 0.84, 0.82, 0.78, and 0.78, respectively, while support vector regression (NSE = 0.68), ridge regression (NSE = 0.53), K-nearest neighbor (NSE = 0.56), and linear regression (NSE = 0.51) performed poorly. (3) The four machine learning methods gradient boosting, random forest, AdaBoost, and bagging simulated the runoff of the Kalakash River for 1978–1998 well, with the NSE exceeding 0.75, and the runoff reconstructed for the missing period (1999–2019) reflects the characteristics of the intra-annual and inter-annual changes in runoff well. |
first_indexed | 2024-03-10T21:51:27Z |
format | Article |
id | doaj.art-5dee1365acb541f88bde200e3570e1a5 |
institution | Directory Open Access Journal |
issn | 2073-4441 |
language | English |
last_indexed | 2024-03-10T21:51:27Z |
publishDate | 2023-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Water |
spelling | doaj.art-5dee1365acb541f88bde200e3570e1a5 2023-11-19T13:25:32Z eng MDPI AG Water 2073-4441 2023-09-01 15 18 3222 10.3390/w15183222 Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models Shuyang Wang 0 Meiping Sun 1 Guoyu Wang 2 Xiaojun Yao 3 Meng Wang 4 Jiawei Li 5 Hongyu Duan 6 Zhenyu Xie 7 Ruiyi Fan 8 Yang Yang 9 Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Chemistry and Chemical Engineering, Chongqing University of Science and Technology, Chongqing 401331, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Department of Geography and Environment Sciences, Northwest Normal University, Lanzhou 730070, China Runoff from the high-cold mountains area (HCMA) is the most important water resource in the arid zone, and its accurate forecasting is key to the scientific management of water resources downstream of the basin. Constrained by the scarcity of meteorological and hydrological stations in the HCMA and the inconsistency of the observed time series, the simulation and reconstruction of mountain runoff have always been a focus of cold region hydrological research. 
Based on runoff observations from different time periods for the Yurungkash and Kalakash Rivers, the upstream tributaries of the Hotan River on the northern slope of the Kunlun Mountains, together with meteorological data and atmospheric circulation indices, we used feature analysis and machine learning methods to select the input elements, train the models, simulate runoff, select the best-performing machine learning models for the two watersheds, and reconstruct the missing runoff time series of the Kalakash River. The results show the following. (1) Air temperature is the most important driver of runoff variability in the mountainous areas upstream of the Hotan River, with the strongest performance in terms of the Pearson correlation coefficient (ρ<sub>XY</sub>) and random forest feature importance (FI) (ρ<sub>XY</sub> = 0.63, FI = 0.723), followed by soil temperature (ρ<sub>XY</sub> = 0.63, FI = 0.043); precipitation, hours of sunshine, wind speed, relative humidity, and atmospheric circulation were only weakly correlated with runoff. A total of 12 elements were selected as the machine learning input data. (2) Comparing the Yurungkash River runoff simulated by eight machine learning methods, we found that gradient boosting and random forest performed best, followed by AdaBoost and bagging, with Nash–Sutcliffe efficiency coefficients (NSE) of 0.84, 0.82, 0.78, and 0.78, respectively, while support vector regression (NSE = 0.68), ridge regression (NSE = 0.53), K-nearest neighbor (NSE = 0.56), and linear regression (NSE = 0.51) performed poorly. 
(3) The four machine learning methods gradient boosting, random forest, AdaBoost, and bagging simulated the runoff of the Kalakash River for 1978–1998 well, with the NSE exceeding 0.75, and the runoff reconstructed for the missing period (1999–2019) reflects the characteristics of the intra-annual and inter-annual changes in runoff well. https://www.mdpi.com/2073-4441/15/18/3222 feature analysis Hotan River Basin high-cold mountains area machine learning runoff simulation and reconstruction |
spellingShingle | Shuyang Wang Meiping Sun Guoyu Wang Xiaojun Yao Meng Wang Jiawei Li Hongyu Duan Zhenyu Xie Ruiyi Fan Yang Yang Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models Water feature analysis Hotan River Basin high-cold mountains area machine learning runoff simulation and reconstruction |
title | Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models |
title_full | Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models |
title_fullStr | Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models |
title_full_unstemmed | Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models |
title_short | Simulation and Reconstruction of Runoff in the High-Cold Mountains Area Based on Multiple Machine Learning Models |
title_sort | simulation and reconstruction of runoff in the high cold mountains area based on multiple machine learning models |
topic | feature analysis Hotan River Basin high-cold mountains area machine learning runoff simulation and reconstruction |
url | https://www.mdpi.com/2073-4441/15/18/3222 |
work_keys_str_mv | AT shuyangwang simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT meipingsun simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT guoyuwang simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT xiaojunyao simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT mengwang simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT jiaweili simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT hongyuduan simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT zhenyuxie simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT ruiyifan simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels AT yangyang simulationandreconstructionofrunoffinthehighcoldmountainsareabasedonmultiplemachinelearningmodels |
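
The description above ranks input elements by the Pearson correlation coefficient and scores the models by the Nash–Sutcliffe efficiency coefficient (NSE), but the record does not spell out either formula. The sketch below uses the standard textbook definitions, which are assumed here to match what the article used. The function names `pearson` and `nash_sutcliffe` and the sample values are hypothetical illustrations, not data or code from the study.

```python
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = fmean(x), fmean(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - residual / spread


if __name__ == "__main__":
    # Made-up runoff values, for illustration only.
    obs = [10.0, 20.0, 30.0, 40.0]
    sim = [12.0, 18.0, 33.0, 37.0]
    print(f"NSE = {nash_sutcliffe(obs, sim):.3f}")  # prints "NSE = 0.948"
    print(f"rho = {pearson(obs, sim):.3f}")
```

Under this definition, the reported NSE of 0.84 for gradient boosting would mean that the squared simulation error is 16% of the variance of the observed runoff around its mean.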