Time series anomaly detection


Bibliographic Details
Main Author: Lek, Jie Kai
Other Authors: Kwoh Chee Keong
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175134
Description
Summary: Anomaly detection on time series data applies to many domains, including machinery prognostics and health management (PHM), which is crucial for ensuring a system’s reliability, increasing operational safety and reducing maintenance cost. In this paper, an autoencoder is implemented to detect anomalies in a time series aero-engine dataset. The dataset consists of multiple sensor readings for each engine, together with the engine’s Remaining Useful Life (RUL) at each point in time. The model is trained on the subset of the dataset where the RUL label is 130 (i.e. the maximum RUL of each engine), which represents the normal operating conditions of the engines before the onset of degradation. Training minimizes the difference between the model’s input and its reconstruction, quantified by the mean squared error. After training, the model is applied to the test dataset and the reconstruction error is calculated for each engine. Anomalies are identified as points where the reconstruction error falls above the interquartile range, and the first index of these anomalies is interpreted as the point in time when anomalous behavior becomes prevalent, signaling the onset of degradation. Once the onset of degradation has been identified, a deep CNN model is used to predict the RUL of each engine. However, degradation patterns vary across entities and engines over their lifetimes, so in most cases an engine’s RUL at a given time step cannot be treated as a deterministic value by simply assigning the exact RUL labels to the data samples.
Therefore, this project further explores cycle-consistent learning, which learns a new data representation subspace in which the degradation data of machines in similar health conditions are well aligned across different entities, accounting for the variation in degradation characteristics between entities or engines. The model is trained and then evaluated on the test dataset to predict the RUL of each engine. The autoencoder is evaluated by comparing its predicted onset of degradation with the actual onset of degradation, and the CNN model with cycle-consistent learning is evaluated using the root mean squared error (RMSE). Lastly, the proposed model is compared against a baseline deep CNN model that is used to predict the RUL of the engines; the proposed model achieves a lower RMSE than the baseline.
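
The onset-detection and evaluation steps described in the summary can be illustrated with a minimal sketch. It is not the project's implementation. The summary states only that anomalies are points whose reconstruction error falls above the interquartile range, so the upper fence Q3 + k * IQR with k = 1.5 is an assumption, as is computing the quartiles from each engine's own error sequence. The function names (`reconstruction_error`, `degradation_onset`, `rmse`) are hypothetical, and the reconstructions are assumed to come from an autoencoder trained elsewhere.

```python
import numpy as np


def reconstruction_error(x, x_hat):
    """Mean squared error per time step for one engine.

    x and x_hat are arrays of shape (time_steps, n_sensors): the original
    sensor readings and the autoencoder's reconstruction of them.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.mean((x - x_hat) ** 2, axis=1)


def degradation_onset(errors, k=1.5):
    """Index of the first time step flagged as anomalous, or None.

    Assumption: a point is anomalous when its error exceeds Q3 + k * IQR,
    with quartiles taken over this engine's own error sequence.
    """
    errors = np.asarray(errors, dtype=float)
    q1, q3 = np.percentile(errors, [25, 75])
    threshold = q3 + k * (q3 - q1)
    anomalous = np.flatnonzero(errors > threshold)
    return int(anomalous[0]) if anomalous.size else None


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted RUL values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

Calling `degradation_onset(reconstruction_error(x, x_hat))` for each engine gives one predicted onset index per engine, which can then be compared with the actual onset. A different value of `k`, or a threshold derived from the training errors instead, would shift the detected onset.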