An unsupervised anomaly detection framework for detecting anomalies in real time through network system’s log files analysis

Nowadays, in almost every computer system, log files are used to keep records of occurring events. Those log files are then used for analyzing and debugging system failures. Due to this important utility, researchers have worked on finding fast and efficient ways to detect anomalies in a computer sy...

Bibliographic Details
Main Authors: Vannel Zeufack, Donghyun Kim, Daehee Seo, Ahyoung Lee
Format: Article
Language: English
Published: Elsevier 2021-12-01
Series: High-Confidence Computing
Subjects: Anomaly detection; Unsupervised machine learning; Clustering; OPTICS; Log analysis
Online Access: http://www.sciencedirect.com/science/article/pii/S2667295221000209
author Vannel Zeufack
Donghyun Kim
Daehee Seo
Ahyoung Lee
collection DOAJ
description Nowadays, almost every computer system uses log files to record the events that occur in it. These log files are then used to analyze and debug system failures. Because of this utility, researchers have worked on fast and efficient ways to detect anomalies in a computer system by analyzing its log records. Research in log-based anomaly detection can be divided into two main categories: batch log-based anomaly detection and streaming log-based anomaly detection. Batch log-based anomaly detection is computationally heavy and cannot detect anomalies instantaneously. Streaming anomaly detection, on the other hand, allows for immediate alerts. However, current streaming approaches are mainly supervised. In this work, we propose a fully unsupervised framework that can detect anomalies in real time. We test our framework on HDFS log files and successfully detect anomalies with an F1 score of 83%.
first_indexed 2024-12-22T13:22:12Z
format Article
id doaj.art-28dc931eb5ee4eb880c289799c741bd5
institution Directory Open Access Journal
issn 2667-2952
language English
last_indexed 2024-12-22T13:22:12Z
publishDate 2021-12-01
publisher Elsevier
record_format Article
series High-Confidence Computing
spelling doaj.art-28dc931eb5ee4eb880c289799c741bd5
2022-12-21T18:24:26Z
eng
Elsevier
High-Confidence Computing
2667-2952
2021-12-01
1
2
100030
An unsupervised anomaly detection framework for detecting anomalies in real time through network system’s log files analysis
Vannel Zeufack [0]
Donghyun Kim [1]
Daehee Seo [2]
Ahyoung Lee [3]
[0] Department of Computer Science, Kennesaw State University, Marietta, GA 30060, USA
[1] Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
[2] Faculty of Artificial Intelligence and Data Engineering, Sangmyung University, Seoul 03016, South Korea
[3] Corresponding author.; Department of Computer Science, Kennesaw State University, Marietta, GA 30060, USA
http://www.sciencedirect.com/science/article/pii/S2667295221000209
Anomaly detection
Unsupervised machine learning
Clustering
OPTICS
Log analysis
title An unsupervised anomaly detection framework for detecting anomalies in real time through network system’s log files analysis
topic Anomaly detection
Unsupervised machine learning
Clustering
OPTICS
Log analysis
url http://www.sciencedirect.com/science/article/pii/S2667295221000209