Cluster-Based Analogue Ensembles for Hindcasting with Multistations

The Analogue Ensemble (AnEn) method enables the reconstruction of meteorological observations or deterministic predictions for a certain variable and station by using data from the same station or from other nearby stations. However, depending on the dimension and granularity of the historical datas...


Bibliographic Details
Main Authors: Carlos Balsa, Carlos Veiga Rodrigues, Leonardo Araújo, José Rufino
Format: Article
Language: English
Published: MDPI AG 2022-06-01
Series: Computation
Subjects: hindcasting, meteorological dataset, analogue ensemble, K-means, time-series
Online Access: https://www.mdpi.com/2079-3197/10/6/91
_version_ 1797488644035969024
author Carlos Balsa
Carlos Veiga Rodrigues
Leonardo Araújo
José Rufino
author_sort Carlos Balsa
collection DOAJ
description The Analogue Ensemble (AnEn) method enables the reconstruction of meteorological observations or deterministic predictions for a certain variable and station by using data from the same station or from other nearby stations. However, depending on the dimension and granularity of the historical datasets used for the reconstruction, this method may be computationally very demanding, even when parallelization is used. In this work, the classical AnEn method is modified so that analogues are determined using K-means clustering. The proposed combined approach allows the use of several predictors in a dependent or independent way. Because this new approach is flexible and adaptable, several parameters and algorithmic options must be defined. The effects of the critical parameters and main options were tested on a large dataset from real-world meteorological stations. The results show that adequate monitoring and tuning of the new method allow for a considerable improvement in the computational performance of the reconstruction task while preserving the accuracy of the results. Compared to the classical AnEn method, the proposed variant is at least 15 times faster when processing is serial. Both approaches benefit from parallel processing, and the K-means variant is always faster than the classical method under that execution regime, although its performance advantage diminishes as more CPU threads are used.
first_indexed 2024-03-10T00:05:14Z
format Article
id doaj.art-5df9573db49f4677aed76ef6e860e4cd
institution Directory Open Access Journal
issn 2079-3197
language English
last_indexed 2024-03-10T00:05:14Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Computation
spelling doaj.art-5df9573db49f4677aed76ef6e860e4cd
2023-11-23T16:09:36Z
eng
MDPI AG
Computation
2079-3197
2022-06-01
vol. 10, no. 6, art. 91
10.3390/computation10060091
Cluster-Based Analogue Ensembles for Hindcasting with Multistations
Carlos Balsa (0): Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-253 Bragança, Portugal
Carlos Veiga Rodrigues (1): Vestas Wind Systems A/S, Vestas Technology Centre Porto, 4465-671 Leça do Balio, Portugal
Leonardo Araújo (2): Universidade Tecnológica Federal do Paraná, Campus de Ponta Grossa, Ponta Grossa 84017-220, Brazil
José Rufino (3): Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-253 Bragança, Portugal
The Analogue Ensemble (AnEn) method enables the reconstruction of meteorological observations or deterministic predictions for a certain variable and station by using data from the same station or from other nearby stations. However, depending on the dimension and granularity of the historical datasets used for the reconstruction, this method may be computationally very demanding, even when parallelization is used. In this work, the classical AnEn method is modified so that analogues are determined using K-means clustering. The proposed combined approach allows the use of several predictors in a dependent or independent way. Because this new approach is flexible and adaptable, several parameters and algorithmic options must be defined. The effects of the critical parameters and main options were tested on a large dataset from real-world meteorological stations. The results show that adequate monitoring and tuning of the new method allow for a considerable improvement in the computational performance of the reconstruction task while preserving the accuracy of the results. Compared to the classical AnEn method, the proposed variant is at least 15 times faster when processing is serial. Both approaches benefit from parallel processing, and the K-means variant is always faster than the classical method under that execution regime, although its performance advantage diminishes as more CPU threads are used.
https://www.mdpi.com/2079-3197/10/6/91
hindcasting
meteorological dataset
analogue ensemble
K-means
time-series
title Cluster-Based Analogue Ensembles for Hindcasting with Multistations
topic hindcasting
meteorological dataset
analogue ensemble
K-means
time-series
url https://www.mdpi.com/2079-3197/10/6/91
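The description in this record states that analogues are determined with K-means clustering rather than by an exhaustive comparison against the whole history. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes a single predictor station, fixed-width time windows, a plain Lloyd-style K-means, and an unweighted mean of the analogue values. All function names, parameters, and the synthetic data are hypothetical and do not come from the article.

```python
import math
import random

def windows(series, half_width):
    """Return (centre index, window) pairs for every admissible centre."""
    k = half_width
    return [(t, series[t - k:t + k + 1]) for t in range(k, len(series) - k)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, vec, candidates):
    """Index of the closest centroid among the candidate cluster ids."""
    return min(candidates, key=lambda c: sq_dist(centroids[c], vec))

def kmeans(vectors, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd iterations; returns centroids and a label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    ids = range(n_clusters)
    for _ in range(n_iter):
        labels = [nearest(centroids, v, ids) for v in vectors]
        for c in ids:
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment so that labels match the returned centroids.
    labels = [nearest(centroids, v, ids) for v in vectors]
    return centroids, labels

def reconstruct(pred_hist, target_hist, pred_new, half_width=1, n_clusters=4):
    """Estimate target-station values for the period covered by pred_new.

    Historical predictor windows are clustered once; each new window is
    matched to its nearest non-empty cluster, whose members act as analogues.
    """
    hist = windows(pred_hist, half_width)
    centroids, labels = kmeans([w for _, w in hist], n_clusters)
    # Target-station values observed at the analogue times of each cluster.
    members = {c: [] for c in range(n_clusters)}
    for (t, _), lab in zip(hist, labels):
        members[lab].append(target_hist[t])
    usable = [c for c in members if members[c]]
    estimates = []
    for _, w in windows(pred_new, half_width):
        c = nearest(centroids, w, usable)
        estimates.append(sum(members[c]) / len(members[c]))
    return estimates

# Synthetic demo data (assumption): the target is a linear function of the predictor.
pred_hist = [math.sin(0.3 * t) for t in range(200)]
target_hist = [2.0 * x + 1.0 for x in pred_hist]
pred_new = [math.sin(0.3 * t) for t in range(200, 230)]
estimates = reconstruct(pred_hist, target_hist, pred_new)
```

In this sketch, each new window is compared only with the cluster centroids instead of with every historical window, which is the kind of saving the description attributes to the clustered variant. The number of clusters and the window half-width are examples of the tunable parameters the description mentions.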