Wind Turbine Fault Detection Using Highly Imbalanced Real SCADA Data

Wind power is cleaner and less expensive compared to other alternative sources, and it has therefore become one of the most important energy sources worldwide. However, challenges related to the operation and maintenance of wind farms significantly contribute to the increase in their overall costs,...

Bibliographic Details
Main Authors: Cristian Velandia-Cardenas, Yolanda Vidal, Francesc Pozo
Format: Article
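Language: English
Published: MDPI AG 2021-03-01
Series: Energies
Subjects: fault detection, machine learning, principal component analysis, SCADA, structural health monitoring, wind turbine
Online Access: https://www.mdpi.com/1996-1073/14/6/1728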
author Cristian Velandia-Cardenas
Yolanda Vidal
Francesc Pozo
collection DOAJ
description Wind power is cleaner and less expensive than other alternative sources, and it has therefore become one of the most important energy sources worldwide. However, challenges related to the operation and maintenance of wind farms contribute significantly to their overall costs; it is therefore necessary to monitor the condition of each wind turbine on the farm and identify the different states of alarm. Common alarms are raised based on data acquired by a supervisory control and data acquisition (SCADA) system; however, this system generates a large number of false positive alerts, which must be handled to minimize inspection costs and perform preventive maintenance before actual critical or catastrophic failures occur. To this end, this paper proposes a fault detection methodology in which different data analysis and data processing techniques are applied to real, imbalanced SCADA data to improve the detection of alarms related to the temperature of the main gearbox of a wind turbine. An imbalanced dataset is a classification dataset with skewed class proportions (more observations from one class than the other), which can introduce bias if it is not handled with caution. Furthermore, the dataset is time dependent, which introduces an additional factor to account for when processing and splitting the data. These methods aim to reduce false positives and false negatives, and to demonstrate the effectiveness of well-applied preprocessing techniques for improving the performance of different machine learning algorithms.
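The two data-handling concerns named in the description (skewed class proportions and time dependence when splitting) can be illustrated with a minimal sketch. It is not the authors' method: the function names, the `gearbox_temp` and `alarm` fields, the toy data, and the inverse-frequency weighting heuristic are all assumptions chosen for illustration.

```python
from collections import Counter


def chronological_split(records, train_fraction=0.75):
    """Split time-stamped records without shuffling, so that every
    test sample is strictly later than every training sample."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).
    Rare classes receive proportionally larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy SCADA-like data: 20 samples, alarms at t = 6, 13 and 18.
records = [
    {"timestamp": t, "gearbox_temp": 60.0 + t % 5, "alarm": int(t in (6, 13, 18))}
    for t in range(20)
]

train, test = chronological_split(records)
weights = balanced_class_weights([r["alarm"] for r in train])
```

Splitting by time rather than at random avoids training on samples recorded after the test period, and the weights can be passed to any learner that accepts per-class weights so that the rare alarm class is not swamped by normal operation.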
first_indexed 2024-03-10T13:04:14Z
format Article
id doaj.art-ce118e7e40e34eaaa1a7ed4bfa25712d
institution Directory Open Access Journal
issn 1996-1073
language English
last_indexed 2024-03-10T13:04:14Z
publishDate 2021-03-01
publisher MDPI AG
record_format Article
series Energies
spelling doaj.art-ce118e7e40e34eaaa1a7ed4bfa25712d
2023-11-21T11:17:21Z
eng
MDPI AG
Energies
ISSN: 1996-1073
2021-03-01
Volume 14, Issue 6, Article 1728
DOI: 10.3390/en14061728
Wind Turbine Fault Detection Using Highly Imbalanced Real SCADA Data
Cristian Velandia-Cardenas, Yolanda Vidal, Francesc Pozo
Affiliation (all three authors): Control, Modeling, Identification and Applications (CoDAlab), Department of Mathematics, Escola d’Enginyeria de Barcelona Est (EEBE), Campus Diagonal-Besòs (CDB), Universitat Politècnica de Catalunya (UPC), Eduard Maristany 16, 08019 Barcelona, Spain
https://www.mdpi.com/1996-1073/14/6/1728
fault detection; machine learning; principal component analysis; SCADA; structural health monitoring; wind turbine
title Wind Turbine Fault Detection Using Highly Imbalanced Real SCADA Data
topic fault detection
machine learning
principal component analysis
SCADA
structural health monitoring
wind turbine
url https://www.mdpi.com/1996-1073/14/6/1728