Low-flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria
Accurate predictions of seasonal low flows are critical for a number of water management tasks that require inferences about water quality and the ecological status of water bodies. This paper proposes an extreme gradient tree boosting model (XGBoost) for predicting monthly low flow in unga...
Main Authors: | J. Laimighofer, M. Melcher, G. Laaha |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications, 2022-09-01 |
Series: | Hydrology and Earth System Sciences |
Online Access: | https://hess.copernicus.org/articles/26/4553/2022/hess-26-4553-2022.pdf |
_version_ | 1798000953875496960 |
---|---|
author | J. Laimighofer M. Melcher G. Laaha |
author_facet | J. Laimighofer M. Melcher G. Laaha |
author_sort | J. Laimighofer |
collection | DOAJ |
description | Accurate predictions of seasonal low flows are critical for a number of water management tasks that require inferences about water quality and the ecological status of water bodies. This paper proposes an extreme gradient tree boosting model (XGBoost) for predicting monthly low flow in ungauged catchments. Particular emphasis is placed on the lowest values (of the magnitude of annual low flows and below) by implementing the expectile loss function in the XGBoost model. For this purpose, we test expectile loss functions based on decreasing expectiles (from τ = 0.5 to τ = 0.01) that give increasing weight to lower values. These are compared to common loss functions such as mean and median absolute loss. Model optimization and evaluation are conducted using a nested cross-validation (CV) approach that includes recursive feature elimination (RFE) to promote parsimonious models. The methods are tested on a comprehensive dataset of 260 stream gauges in Austria, covering a wide range of low-flow regimes. Our results demonstrate that the expectile loss function can yield high prediction accuracy, but the performance drops sharply for low expectile models. With a median R² of 0.67, the 0.5 expectile yields the best-performing model. The 0.3 and 0.2 expectiles perform slightly worse, but still outperform the common median and mean absolute loss functions. All expectile models include some stations with moderate or poor performance that can be attributed to systematic error, while the seasonal and annual variability is well covered by the models. Results for the prediction of low extremes show increasing performance in terms of R² for smaller expectiles (0.01, 0.025, 0.05), though at the cost of classifying too many values as extremes at each station. We found that the application of different expectiles leads to a trade-off between overall performance, prediction performance for extremes, and misclassification of extreme low-flow events. Our results show that the 0.1 or 0.2 expectiles perform best with respect to all three criteria. The resulting extreme gradient tree boosting model covers seasonal and annual variability well and provides a viable approach for spatiotemporal modeling of a range of hydrological variables representing average conditions and extreme events. |
first_indexed | 2024-04-11T11:28:26Z |
format | Article |
id | doaj.art-b063223796fb47e1963e80e1f8a6ebf5 |
institution | Directory Open Access Journal |
issn | 1027-5606 1607-7938 |
language | English |
last_indexed | 2024-04-11T11:28:26Z |
publishDate | 2022-09-01 |
publisher | Copernicus Publications |
record_format | Article |
series | Hydrology and Earth System Sciences |
spelling | doaj.art-b063223796fb47e1963e80e1f8a6ebf5; 2022-12-22T04:26:12Z; eng; Copernicus Publications; Hydrology and Earth System Sciences; 1027-5606; 1607-7938; 2022-09-01; 26; 4553; 4574; 10.5194/hess-26-4553-2022; Low-flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria; J. Laimighofer [0]; M. Melcher [1]; G. Laaha [2]; [0] Department of Landscape, Spatial and Infrastructure Sciences, Institute of Statistics, University of Natural Resources and Life Sciences, Vienna, Peter-Jordan-Strasse 82/I, 1190 Vienna, Austria; [1] Institute of Information Management, FH JOANNEUM – University of Applied Sciences, Graz, Austria; [2] Department of Landscape, Spatial and Infrastructure Sciences, Institute of Statistics, University of Natural Resources and Life Sciences, Vienna, Peter-Jordan-Strasse 82/I, 1190 Vienna, Austria; https://hess.copernicus.org/articles/26/4553/2022/hess-26-4553-2022.pdf |
title | Low-flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria |
title_sort | low flow estimation beyond the mean expectile loss and extreme gradient boosting for spatiotemporal low flow prediction in austria |
url | https://hess.copernicus.org/articles/26/4553/2022/hess-26-4553-2022.pdf |
work_keys_str_mv | AT jlaimighofer lowflowestimationbeyondthemeanexpectilelossandextremegradientboostingforspatiotemporallowflowpredictioninaustria AT mmelcher lowflowestimationbeyondthemeanexpectilelossandextremegradientboostingforspatiotemporallowflowpredictioninaustria AT glaaha lowflowestimationbeyondthemeanexpectilelossandextremegradientboostingforspatiotemporallowflowpredictioninaustria |
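
The description above states that expectile losses with decreasing τ "give increasing weight to lower values". A minimal sketch of that idea follows, assuming the standard asymmetric squared-error form of the expectile loss, L_τ(y, ŷ) = |τ − 1(y < ŷ)| · (y − ŷ)²; the record does not give the authors' exact formulation or scaling. All function names are hypothetical and the flow values are made up for illustration. The per-observation gradient and Hessian are the quantities a gradient boosting library typically asks for in a custom objective, but no XGBoost call is used here.

```python
# Sketch of the expectile loss (asymmetric squared error), standard library only.
# All names are illustrative; they are not taken from the paper or from XGBoost.

def expectile_weight(residual, tau):
    """Weight tau for under-prediction (y >= y_hat), 1 - tau for over-prediction."""
    return tau if residual >= 0.0 else 1.0 - tau


def expectile_loss(y_true, y_pred, tau):
    """Mean asymmetric squared error for expectile level tau in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        r = y - p
        total += expectile_weight(r, tau) * r * r
    return total / len(y_true)


def expectile_grad_hess(y_true, y_pred, tau):
    """First and second derivative of the loss with respect to each prediction."""
    grad, hess = [], []
    for y, p in zip(y_true, y_pred):
        w = expectile_weight(y - p, tau)
        grad.append(-2.0 * w * (y - p))
        hess.append(2.0 * w)
    return grad, hess


def sample_expectile(values, tau, n_iter=100, tol=1e-12):
    """Constant prediction minimising the expectile loss, found by Newton steps."""
    m = sum(values) / len(values)  # start at the mean (the tau = 0.5 expectile)
    for _ in range(n_iter):
        grad, hess = expectile_grad_hess(values, [m] * len(values), tau)
        step = sum(grad) / sum(hess)
        m -= step
        if abs(step) < tol:
            break
    return m


if __name__ == "__main__":
    flows = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up monthly low flows
    for tau in (0.5, 0.2, 0.1):
        print(tau, sample_expectile(flows, tau))
```

With τ = 0.5 the loss is symmetric and the minimiser is the arithmetic mean. As τ decreases, over-predictions are penalised more heavily than under-predictions, so the fitted value moves toward the low end of the distribution, which is the behaviour the description relies on for targeting low extremes.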