A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets

The generalisation of Neural Networks (NN) to multiple datasets is often overlooked in literature due to NNs typically being optimised for specific data sources. This becomes especially challenging in time-series tasks due to difficulties in fusing temporal data from multiple sources. However, in a...


Bibliographic Details
Main Authors: Ayman Elhalwagy, Tatiana Kalganova
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
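Subjects: Generalisation; dataset fusion; data reduction; anomaly detection; neural network training; green AI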
Online Access: https://ieeexplore.ieee.org/document/10290724/
author Ayman Elhalwagy
Tatiana Kalganova
collection DOAJ
description The generalisation of Neural Networks (NN) to multiple datasets is often overlooked in literature due to NNs typically being optimised for specific data sources. This becomes especially challenging in time-series tasks due to difficulties in fusing temporal data from multiple sources. However, in a commercial environment, generalisation can effectively utilise available data and computational power which is essential to Green AI, the sustainable development of AI models. This paper introduces “Dataset Fusion,” a novel dataset composition algorithm for fusing periodic signals from multiple homogeneous datasets whilst retaining unique features for generalised anomaly detection. The proposed approach, tested on a case study of three-phase current data from two different homogeneous Induction Motor (IM) fault datasets on anomaly detection, outperforms conventional training approaches with an Average F1 score of 0.879 and effectively generalises across all datasets. Furthermore, when tested with varying percentages of the training data, results show that using only 6.25% of the training data, translating to a 93.7% reduction in computational power, results in only a 4.04% decrease in performance, demonstrating the advantages of the proposed approach in terms of both performance and computational efficiency. Moreover, the algorithm’s effectiveness under imperfect conditions highlights its potential for use in real-world applications.
first_indexed 2024-03-11T12:21:29Z
format Article
id doaj.art-2c3a85b1776048b3a5120924f3e979ab
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-11T12:21:29Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-2c3a85b1776048b3a5120924f3e979ab
2023-11-07T00:01:12Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
Volume 11, pages 121212-121230
DOI 10.1109/ACCESS.2023.3326725
10290724
A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets
Ayman Elhalwagy, https://orcid.org/0000-0003-0772-9059, Department of Electronic and Electrical Engineering, CEDPS, Brunel University London, Uxbridge, U.K.
Tatiana Kalganova, https://orcid.org/0000-0003-4859-7152, Department of Electronic and Electrical Engineering, CEDPS, Brunel University London, Uxbridge, U.K.
https://ieeexplore.ieee.org/document/10290724/
Generalisation; dataset fusion; data reduction; anomaly detection; neural network training; green AI
title A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets
title_sort dataset fusion algorithm for generalised anomaly detection in homogeneous periodic time series datasets
topic Generalisation
dataset fusion
data reduction
anomaly detection
neural network training
green AI
url https://ieeexplore.ieee.org/document/10290724/