CyL-GHI: Global Horizontal Irradiance Dataset Containing 18 Years of Refined Data at 30-Min Granularity from 37 Stations Located in Castile and León (Spain)

Accurate solar forecasting increasingly relies on advances in artificial intelligence and on the availability of databases with large amounts of information on meteorological variables. In this paper, we present the methodology applied to produce a large-scale, public solar irradiance...

Bibliographic Details
Main Authors: Llinet Benavides Cesar, Miguel Ángel Manso Callejo, Calimanut-Ionut Cira, Ramon Alcarria
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Data
Subjects: global horizontal irradiance; weather measurements; extended area; Spain region
Online Access: https://www.mdpi.com/2306-5729/8/4/65
collection DOAJ
description Accurate solar forecasting increasingly relies on advances in artificial intelligence and on the availability of databases with large amounts of information on meteorological variables. In this paper, we present the methodology applied to produce a large-scale, public solar irradiance dataset, CyL-GHI, containing refined data from 37 stations located within the Spanish region of Castile and León (Spanish: Castilla y León, or CyL). In addition to the data cleaning steps, the procedure features steps that add meteorological and geographical variables to complement the initial data. The resulting dataset is delivered both in raw format and with the quality processing applied, and continuously covers 18 years (1 January 2002 to 31 December 2019) at a temporal resolution of 30 min. CyL-GHI can be of great value to studies focused on the spatio-temporal characteristics of solar irradiance data, because the geographical information it includes enables a regional analysis of the phenomena (the 37 stations cover a land area larger than 94,226 km²). Three popular artificial intelligence algorithms were then optimised and tested on CyL-GHI, and their performance values are offered as baselines for comparing other forecasting implementations. Furthermore, the ERA5 values corresponding to the studied area were analysed and compared with the performance values delivered by the trained models. Including previous observations from neighbouring stations as input to an optimised Random Forest model (a spatio-temporal approach) improved the predictive capability of the machine learning models by almost 3%.
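To give a sense of the scale implied by the description, the following minimal Python sketch counts the 30-minute slots between 1 January 2002 and 31 December 2019 for one station and for all 37 stations. It assumes a gap-free series, which the record does not state, so the result is only an upper bound on the number of observations; the names `expected_slots`, `START`, `END`, `STEP`, and `N_STATIONS` are illustrative and do not come from the dataset.

```python
from datetime import datetime, timedelta

# Period, resolution, and station count are taken from the description.
# The gap-free assumption and all names below are illustrative only.
START = datetime(2002, 1, 1, 0, 0)
END = datetime(2020, 1, 1, 0, 0)  # exclusive: first instant after 31 December 2019
STEP = timedelta(minutes=30)
N_STATIONS = 37


def expected_slots(start, end, step):
    """Return the number of step-sized slots in the half-open interval [start, end)."""
    return (end - start) // step


slots_per_station = expected_slots(START, END, STEP)
total_slots = slots_per_station * N_STATIONS
print(slots_per_station, total_slots)
```

Any difference between these counts and the rows actually present in a downloaded file would reflect missing or filtered records.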
first_indexed 2024-03-11T05:06:34Z
id doaj.art-eadd32a27a4a4947b47afccdf2681a18
institution Directory Open Access Journal
issn 2306-5729
last_indexed 2024-03-11T05:06:34Z
spelling doaj.art-eadd32a27a4a4947b47afccdf2681a18; 2023-11-17T18:53:19Z; eng; MDPI AG; Data; 2306-5729; 2023-03-01; volume 8, issue 4, article 65; 10.3390/data8040065
affiliation (all four authors) Departamento de Ingeniería Topográfica y Cartográfica, Escuela Técnica Superior de Ingenieros en Topografía, Geodesia y Cartografía, Universidad Politécnica de Madrid, Calle Mercator, 2, 28031 Madrid, Spain
topic global horizontal irradiance
weather measurements
extended area
Spain region