NWP-based lightning prediction using flexible count data regression


Bibliographic Details
Main Authors: T. Simon, G. J. Mayr, N. Umlauf, A. Zeileis
Format: Article
Language: English
Published: Copernicus Publications 2019-02-01
Series: Advances in Statistical Climatology, Meteorology and Oceanography
Online Access: https://www.adv-stat-clim-meteorol-oceanogr.net/5/1/2019/ascmo-5-1-2019.pdf
Description
Summary: A method to predict lightning by postprocessing numerical weather prediction (NWP) output is developed for the region of the European Eastern Alps. Cloud-to-ground (CG) flashes – detected by the ground-based Austrian Lightning Detection & Information System (ALDIS) network – are counted on the 18×18 km² grid of the 51-member NWP ensemble of the European Centre for Medium-Range Weather Forecasts (ECMWF). These counts serve as the target quantity in count data regression models for the occurrence of lightning events and flash counts of CG. The probability of lightning occurrence is modelled by a Bernoulli distribution. The flash counts are modelled with a hurdle approach where the Bernoulli distribution is combined with a zero-truncated negative binomial. In the statistical models the parameters of the distributions are described by additive predictors, which are assembled using potentially nonlinear functions of NWP covariates. Measures of location and spread of 100 direct and derived NWP covariates provide a pool of candidates for the nonlinear terms. A combination of stability selection and gradient boosting identifies the nine (three) most influential terms for the parameters of the Bernoulli (zero-truncated negative binomial) distribution, most of which turn out to be associated with either convective available potential energy (CAPE) or convective precipitation. Markov chain Monte Carlo (MCMC) sampling estimates the final model to provide credible inference of effects, scores, and predictions. The selection of terms and MCMC sampling are applied for data of the year 2016, and out-of-sample performance is evaluated for 2017. The occurrence model outperforms a reference climatology – based on 7 years of data – up to a forecast horizon of 5 days. The flash count model is calibrated and also outperforms climatology for exceedance probabilities, quantiles, and full predictive distributions.
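The hurdle construction described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation. The parameter names `pi` (probability of lightning occurrence), `mu` and `theta` (mean and dispersion of the untruncated negative binomial), and the numeric values in the usage note, are assumptions chosen for illustration. In the paper these parameters depend on NWP covariates through additive predictors, which this sketch leaves out.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu and dispersion (size) theta."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def hurdle_pmf(y, pi, mu, theta):
    """Hurdle pmf: Bernoulli occurrence combined with a zero-truncated NB.

    pi is the probability that at least one flash occurs (hypothetical name).
    """
    if y == 0:
        return 1.0 - pi
    # Renormalise the NB over the positive counts only (zero truncation).
    return pi * nb_pmf(y, mu, theta) / (1.0 - nb_pmf(0, mu, theta))


def exceedance_prob(threshold, pi, mu, theta):
    """P(Y > threshold) under the hurdle model."""
    below = sum(hurdle_pmf(y, pi, mu, theta) for y in range(threshold + 1))
    return 1.0 - below
```

For example, `exceedance_prob(10, pi=0.3, mu=5.0, theta=2.0)` gives the probability of more than 10 flashes in a grid cell under these illustrative values, and `exceedance_prob(0, ...)` reduces to `pi`, the occurrence probability.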
ISSN: 2364-3579, 2364-3587