Low-flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria

Bibliographic Details
Main Authors: J. Laimighofer, M. Melcher, G. Laaha
Format: Article
Language: English
Published: Copernicus Publications 2022-09-01
Series: Hydrology and Earth System Sciences
Online Access: https://hess.copernicus.org/articles/26/4553/2022/hess-26-4553-2022.pdf
author J. Laimighofer
M. Melcher
G. Laaha
collection DOAJ
description Accurate predictions of seasonal low flows are critical for a number of water management tasks that require inferences about water quality and the ecological status of water bodies. This paper proposes an extreme gradient tree boosting model (XGBoost) for predicting monthly low flow in ungauged catchments. Particular emphasis is placed on the lowest values (in the magnitude of annual low flows and below) by implementing the expectile loss function in the XGBoost model. For this purpose, we test expectile loss functions based on decreasing expectiles (from τ = 0.5 to 0.01) that give increasing weight to lower values. These are compared to common loss functions such as mean and median absolute loss. Model optimization and evaluation are conducted using a nested cross-validation (CV) approach that includes recursive feature elimination (RFE) to promote parsimonious models. The methods are tested on a comprehensive dataset of 260 stream gauges in Austria, covering a wide range of low-flow regimes. Our results demonstrate that the expectile loss function can yield high prediction accuracy, but the performance drops sharply for low expectile models. With a median R² of 0.67, the 0.5 expectile yields the best-performing model. The 0.3 and 0.2 expectiles perform slightly worse, but still outperform the common median and mean absolute loss functions. All expectile models include some stations with moderate and poor performance that can be attributed to a systematic error, while the seasonal and annual variability is well covered by the models. Results for the prediction of low extremes show an increasing performance in terms of R² for smaller expectiles (0.01, 0.025, 0.05), although this comes with the disadvantage of classifying too many extremes for each station. We found that the application of different expectiles leads to a trade-off between overall performance, prediction performance for extremes, and misclassification of extreme low-flow events. Our results show that the 0.1 or 0.2 expectiles perform best with respect to all three criteria. The resulting extreme gradient tree boosting model covers seasonal and annual variability well and provides a viable approach for spatiotemporal modeling of a range of hydrological variables representing average conditions and extreme events.
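The abstract's statement that decreasing expectiles give increasing weight to lower values can be illustrated with a minimal sketch. It assumes the standard asymmetric squared-error definition of the expectile loss, |τ − 1(y < ŷ)| · (y − ŷ)², which this record does not spell out, and it is not the authors' implementation. The function names `expectile_grad_hess` and `sample_expectile` and the example flows are hypothetical. Gradient boosting libraries such as XGBoost accept custom objectives that return a per-sample gradient and Hessian, which is the pair computed here in plain Python.

```python
def expectile_grad_hess(y_true, y_pred, tau):
    """Per-sample gradient and Hessian of the expectile loss
    w * (y - pred)**2 with respect to the prediction, where
    w = tau if y >= pred (under-prediction) else 1 - tau.
    Assumed standard definition; names are hypothetical."""
    grads, hessians = [], []
    for y, pred in zip(y_true, y_pred):
        residual = y - pred
        weight = tau if residual >= 0 else 1.0 - tau
        grads.append(-2.0 * weight * residual)
        hessians.append(2.0 * weight)
    return grads, hessians


def sample_expectile(values, tau, n_iter=100):
    """Constant prediction minimising the expectile loss, found by
    repeated Newton steps starting from the arithmetic mean."""
    estimate = sum(values) / len(values)
    for _ in range(n_iter):
        grads, hessians = expectile_grad_hess(
            values, [estimate] * len(values), tau
        )
        estimate -= sum(grads) / sum(hessians)
    return estimate


if __name__ == "__main__":
    # Made-up monthly flows, for illustration only.
    flows = [1.0, 2.0, 3.0, 4.0, 10.0]
    for tau in (0.5, 0.2, 0.1, 0.01):
        print(tau, sample_expectile(flows, tau))
```

With τ = 0.5 both sides of the residual are weighted equally, so the minimiser is the mean. For smaller τ, over-predictions are penalised more heavily than under-predictions, which pulls the fitted value toward the low end of the sample.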
first_indexed 2024-04-11T11:28:26Z
format Article
id doaj.art-b063223796fb47e1963e80e1f8a6ebf5
institution Directory Open Access Journal
issn 1027-5606
1607-7938
language English
last_indexed 2024-04-11T11:28:26Z
publishDate 2022-09-01
publisher Copernicus Publications
record_format Article
series Hydrology and Earth System Sciences
spelling doaj.art-b063223796fb47e1963e80e1f8a6ebf5
2022-12-22T04:26:12Z
eng
Copernicus Publications
Hydrology and Earth System Sciences
1027-5606
1607-7938
2022-09-01
vol. 26, pp. 4553-4574
10.5194/hess-26-4553-2022
Low-flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria
J. Laimighofer (0), M. Melcher (1), G. Laaha (2)
(0) Department of Landscape, Spatial and Infrastructure Sciences, Institute of Statistics, University of Natural Resources and Life Sciences, Vienna, Peter-Jordan-Strasse 82/I, 1190 Vienna, Austria
(1) Institute of Information Management, FH JOANNEUM – University of Applied Sciences, Graz, Austria
(2) Department of Landscape, Spatial and Infrastructure Sciences, Institute of Statistics, University of Natural Resources and Life Sciences, Vienna, Peter-Jordan-Strasse 82/I, 1190 Vienna, Austria
https://hess.copernicus.org/articles/26/4553/2022/hess-26-4553-2022.pdf
title Low-flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria
url https://hess.copernicus.org/articles/26/4553/2022/hess-26-4553-2022.pdf