Price Forecasting for the Balancing Energy Market Using Machine-Learning Regression

Bibliographic Details
Main Authors: Alexandre Lucas, Konstantinos Pegios, Evangelos Kotsakis, Dan Clarke
Format: Article
Language: English
Published: MDPI AG 2020-10-01
Series: Energies
Online Access: https://www.mdpi.com/1996-1073/13/20/5420
Description
Summary: Price forecasting has gained attention over the last few years with the growth of aggregators and the general opening of the European electricity markets. Market participants manage a trade-off between bidding in a lower-price market (day-ahead) with typically higher volume and aiming for a lower-volume market with potentially higher returns (the balancing energy market). Companies try to forecast the extremes of revenues or prices in order to manage risk and opportunity and to assign their assets optimally. Electricity markets are generally thought to follow quasi-deterministic principles rather than being driven by speculation, hence the desire to forecast the price from variables that can describe the outcome of the market. Many studies address this problem with a statistical approach or with multiple-variable regressions, but they very often focus only on time series analysis. In 2019, the Loss of Load Probability (LOLP) was made available in the UK for the first time. Taking this opportunity, this study focuses on five LOLP variables (with different time-ahead estimations) and other quasi-deterministic variables to explain price behavior in a multi-variable regression model. These include base production, system load, solar and wind generation, seasonality, day-ahead price and imbalance volume contributions. Three machine-learning algorithms were tested for performance: Gradient Boosting (GB), Random Forest (RF) and XGBoost. XGBoost performed best and was therefore chosen for the implementation of the real-time forecast step. The model returns a Mean Absolute Error (MAE) of 7.89 £/MWh, a coefficient of determination (R² score) of 76.8% and a Mean Squared Error (MSE) of 124.74. The variables that contribute the most to the model are the Net Imbalance Volume, the LOLP (aggregated), the month and the de-rated margins (aggregated), with 28.6%, 27.5%, 14.0% and 8.9% of the feature importance weight, respectively.
ISSN: 1996-1073
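
Note on the reported metrics: the summary quotes three regression scores (MAE, MSE and R²). As a reading aid, the minimal sketch below shows how these quantities are conventionally defined, using only the Python standard library. The price values are invented for illustration and are not taken from the article, and the function names are hypothetical; the article's own evaluation code is not part of this record. With prices in £/MWh, MAE is in £/MWh while MSE is in (£/MWh)².

```python
from math import fsum


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return fsum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return fsum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = fsum(actual) / len(actual)
    ss_res = fsum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = fsum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical balancing prices in £/MWh (made up, not from the article).
actual = [40.0, 55.0, 30.0, 75.0]
predicted = [42.0, 50.0, 33.0, 70.0]

print("MAE:", mean_absolute_error(actual, predicted))
print("MSE:", mean_squared_error(actual, predicted))
print("R2: ", r2_score(actual, predicted))
```

Because MSE squares each error, a few large misses on price spikes raise it far more than they raise MAE, which is one reason both scores are usually reported together.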