A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets

Bibliographic Details
Main Authors: Ayman Elhalwagy, Tatiana Kalganova
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10290724/
Description
Summary: The generalisation of Neural Networks (NNs) to multiple datasets is often overlooked in the literature because NNs are typically optimised for specific data sources. Generalisation is especially challenging in time-series tasks because temporal data from multiple sources is difficult to fuse. However, in a commercial environment, generalisation makes effective use of available data and computational power, which is essential to Green AI, the sustainable development of AI models. This paper introduces “Dataset Fusion,” a novel dataset composition algorithm that fuses periodic signals from multiple homogeneous datasets whilst retaining their unique features for generalised anomaly detection. The proposed approach, tested in an anomaly detection case study on three-phase current data from two different homogeneous Induction Motor (IM) fault datasets, outperforms conventional training approaches with an average F1 score of 0.879 and generalises effectively across all datasets. Furthermore, when tested with varying percentages of the training data, using only 6.25% of the training data, which translates to a 93.7% reduction in computational power, results in only a 4.04% decrease in performance, demonstrating the advantages of the proposed approach in both performance and computational efficiency. Moreover, the algorithm’s effectiveness under imperfect conditions highlights its potential for use in real-world applications.
ISSN:2169-3536
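The arithmetic behind the summary's data-reduction figure can be checked with a short sketch. It assumes that training cost scales linearly with the amount of training data. That assumption is ours and is not stated in this record, and the function name `training_reduction` is hypothetical. Under that assumption, using 6.25% (1/16) of the data saves 93.75% of the cost, in line with the 93.7% quoted in the summary.

```python
from fractions import Fraction


def training_reduction(fraction_used: Fraction) -> Fraction:
    """Return the share of training data that is left unused.

    Assumption (ours, not from the record): compute cost is proportional
    to the amount of training data, so this share is also the compute saved.
    """
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in the interval (0, 1]")
    return 1 - fraction_used


# 6.25% of the training data, written exactly as a fraction.
used = Fraction(625, 10000)
saved = training_reduction(used)

print(used)                # 1/16
print(float(saved) * 100)  # 93.75
```

Exact fractions are used here so that the percentage is not affected by floating-point rounding.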