Low-Flow (7-Day, 10-Year) Classical Statistical and Improved Machine Learning Estimation Methodologies

Water resource managers require accurate estimates of the 7-day, 10-year low flow (7Q10) of streams for many reasons, including protecting aquatic species, designing wastewater treatment plants, and calculating municipal water availability. StreamStats, a publicly available web application developed...

Bibliographic Details
Main Authors: Andrew DelSanto, Md Abul Ehsan Bhuiyan, Konstantinos M. Andreadis, Richard N. Palmer
Format: Article
Language: English
Published: MDPI AG 2023-08-01
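Series: Water
Subjects: machine learning; statistical methods; hydrology; extreme hydrologic events; long-term forecasting
Online Access: https://www.mdpi.com/2073-4441/15/15/2813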
author Andrew DelSanto
Md Abul Ehsan Bhuiyan
Konstantinos M. Andreadis
Richard N. Palmer
collection DOAJ
description Water resource managers require accurate estimates of the 7-day, 10-year low flow (7Q10) of streams for many reasons, including protecting aquatic species, designing wastewater treatment plants, and calculating municipal water availability. StreamStats, a publicly available web application developed by the United States Geological Survey, is commonly used by resource managers to estimate the 7Q10 in states where it is available; it relies on state-by-state, locally calibrated regression equations. This paper expands StreamStats’ methodology and improves 7Q10 estimation by developing a more generalized, regionally applicable methodology. In addition to the classical methods, namely multiple linear regression (MLR) and multiple linear regression in log space (LTLR), three promising machine learning algorithms, random forest (RF) decision trees, neural networks (NN), and generalized additive models (GAM), are tested to determine whether more advanced statistical methods offer improved estimation. For illustrative purposes, the methodology is applied to and verified for the full range of unimpaired, gaged basins in both the northeast and mid-Atlantic hydrologic regions of the United States (basin sizes ranging from 2 to 1419 mi²) using leave-one-out cross-validation (LOOCV). The squared Pearson correlation coefficient (R²), root mean square error (RMSE), Kling–Gupta Efficiency (KGE), and Nash–Sutcliffe Efficiency (NSE) are used to evaluate the performance of each method. Results suggest that the performance of each method varies with basin size, with RF displaying the smallest average RMSE (5.85) across all ranges of basin sizes.
first_indexed 2024-03-11T00:13:41Z
format Article
id doaj.art-931f2678ecae41a3bc6aef4227b0360a
institution Directory of Open Access Journals
issn 2073-4441
language English
last_indexed 2024-03-11T00:13:41Z
publishDate 2023-08-01
publisher MDPI AG
record_format Article
series Water
spelling doaj.art-931f2678ecae41a3bc6aef4227b0360a
2023-11-18T23:48:06Z
eng
MDPI AG
Water
2073-4441
2023-08-01
15 (15), 2813
10.3390/w15152813
Low-Flow (7-Day, 10-Year) Classical Statistical and Improved Machine Learning Estimation Methodologies
Andrew DelSanto, Md Abul Ehsan Bhuiyan, Konstantinos M. Andreadis, Richard N. Palmer (all: Department of Civil and Environmental Engineering, University of Massachusetts, Amherst, MA 01003, USA)
https://www.mdpi.com/2073-4441/15/15/2813
machine learning; statistical methods; hydrology; extreme hydrologic events; long-term forecasting
title Low-Flow (7-Day, 10-Year) Classical Statistical and Improved Machine Learning Estimation Methodologies
topic machine learning
statistical methods
hydrology
extreme hydrologic events
long-term forecasting
url https://www.mdpi.com/2073-4441/15/15/2813
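
The description names several evaluation metrics (RMSE, NSE, KGE) and a validation scheme (LOOCV) applied to, among other models, a linear regression in log space. The following is a minimal sketch of how those pieces fit together, not the authors' implementation. The function names, the synthetic basin data, and the two predictors (drainage area and precipitation) are hypothetical and chosen only for illustration. KGE is written in its commonly used 2009 form, which may differ from the variant used in the article, and no bias correction is applied when back-transforming from log space.

```python
import numpy as np


def _pair(obs, sim):
    """Convert inputs to float arrays of matching shape."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    if obs.shape != sim.shape:
        raise ValueError("obs and sim must have the same shape")
    return obs, sim


def rmse(obs, sim):
    """Root mean square error."""
    obs, sim = _pair(obs, sim)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the mean of obs."""
    obs, sim = _pair(obs, sim)
    return float(1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))


def kge(obs, sim):
    """Kling-Gupta Efficiency (2009 form): correlation, variability, bias."""
    obs, sim = _pair(obs, sim)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()    # variability ratio
    beta = sim.mean() / obs.mean()   # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


def loocv_log_linear(X, y):
    """Leave-one-out predictions from a linear regression fitted in log space.

    X: (n, p) array of strictly positive predictors; y: (n,) positive targets.
    Each sample is predicted by a model fitted on the other n - 1 samples.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), np.log(X)])  # intercept + log predictors
    log_y = np.log(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                  # hold out sample i
        coef, *_ = np.linalg.lstsq(A[keep], log_y[keep], rcond=None)
        preds[i] = np.exp(A[i] @ coef)            # back-transform to flow units
    return preds


# Synthetic, purely illustrative "basins" (not data from the article).
rng = np.random.default_rng(0)
area = rng.uniform(2, 1419, size=40)              # drainage area, mi^2
precip = rng.uniform(35, 55, size=40)             # hypothetical second predictor
flow = 0.01 * area ** 1.1 * (precip / 45) ** 2 * np.exp(rng.normal(0, 0.3, 40))

pred = loocv_log_linear(np.column_stack([area, precip]), flow)
print(f"RMSE={rmse(flow, pred):.2f}  NSE={nse(flow, pred):.2f}  KGE={kge(flow, pred):.2f}")
```

Because every prediction comes from a model that never saw the held-out basin, the three scores describe out-of-sample skill rather than goodness of fit.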