An Integrated Method for Anomaly Detection From Massive System Logs
Logs are generated by systems to record the detailed runtime information about system operations, and log analysis plays an important role in anomaly detection at the host or network level. Most existing detection methods require a priori knowledge, which cannot be used to detect the new or unknown...
Main Authors: | Zhaoli Liu, Tao Qin, Xiaohong Guan, Hezhi Jiang, Chenxu Wang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2018-01-01 |
Series: | IEEE Access |
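Subjects: | Anomaly detection, clustering-filtering-refinement, K-prototype clustering, k-NN classification, massive logs |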
Online Access: | https://ieeexplore.ieee.org/document/8371223/ |
_version_ | 1819169208611635200 |
---|---|
author | Zhaoli Liu Tao Qin Xiaohong Guan Hezhi Jiang Chenxu Wang |
author_sort | Zhaoli Liu |
collection | DOAJ |
description | Logs are generated by systems to record detailed runtime information about system operations, and log analysis plays an important role in anomaly detection at the host or network level. Most existing detection methods require a priori knowledge and therefore cannot detect new or unknown anomalies. Moreover, the growing volume of logs poses new challenges to anomaly detection. In this paper, we propose an integrated method that combines K-prototype clustering and k-NN classification in a novel clustering-filtering-refinement framework to detect anomalies in massive logs. First, we analyze the characteristics of system logs and extract 10 features based on session information to characterize user behaviors effectively. Second, based on these extracted features, the K-prototype clustering algorithm is applied to partition the data set into clusters. Then, the obviously normal events, which usually appear as highly coherent clusters, are filtered out, and the remaining events are regarded as anomaly candidates for further analysis. Finally, we design two new distance-based features to measure the local and global anomaly degrees of these candidates. Based on these two features, we apply the k-NN classifier to generate accurate detection results. To verify the integrated method, we constructed a log collection and anomaly detection platform in the campus network center of Xi'an Jiaotong University. Experimental results on the data sets collected from the platform show that our method has high detection accuracy and low computational complexity. |
first_indexed | 2024-12-22T19:15:51Z |
format | Article |
id | doaj.art-b0b0f348a59943cd8278b3e564cec1d4 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T19:15:51Z |
publishDate | 2018-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-b0b0f348a59943cd8278b3e564cec1d4; 2022-12-21T18:15:31Z; eng; IEEE; IEEE Access; 2169-3536; 2018-01-01; 6; 30602; 30611; 10.1109/ACCESS.2018.2843336; 8371223; An Integrated Method for Anomaly Detection From Massive System Logs; Zhaoli Liu; Tao Qin (https://orcid.org/0000-0003-4874-2567); Xiaohong Guan; Hezhi Jiang; Chenxu Wang; Key Laboratory for Intelligent Networks and Network Security of the Ministry of Education, Xi’an Jiaotong University, Xi’an, China; https://ieeexplore.ieee.org/document/8371223/; Anomaly detection; clustering-filtering-refinement; K-prototype clustering; k-NN classification; massive logs |
title | An Integrated Method for Anomaly Detection From Massive System Logs |
topic | Anomaly detection clustering-filtering-refinement K-prototype clustering k-NN classification massive logs |
url | https://ieeexplore.ieee.org/document/8371223/ |
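
The clustering-filtering-refinement flow summarized in the description can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the 10 session features, the filtering rule, or the definitions of the two distance-based features, so everything below is an assumption made for illustration. The function names, the `gamma` weight, the `min_size` and `max_spread` thresholds, and the choice of "distance to own prototype" (local) and "distance to the nearest normal prototype" (global) are all hypothetical. Only the mixed numeric/categorical dissimilarity follows the standard K-prototype definition.

```python
from collections import Counter

def mixed_distance(a, b, n_num, gamma=1.0):
    """K-prototype dissimilarity: squared Euclidean distance on the first
    n_num (numeric) fields plus gamma times the categorical mismatch count."""
    numeric = sum((x - y) ** 2 for x, y in zip(a[:n_num], b[:n_num]))
    mismatches = sum(x != y for x, y in zip(a[n_num:], b[n_num:]))
    return numeric + gamma * mismatches

def k_prototypes(rows, seeds, n_num, gamma=1.0, max_iter=20):
    """Cluster mixed-type rows, starting from caller-supplied seed prototypes."""
    protos = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(protos)),
                key=lambda c: mixed_distance(r, protos[c], n_num, gamma))
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(len(protos)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if not members:
                continue  # keep the old prototype for an empty cluster
            means = [sum(col) / len(members)
                     for col in zip(*[m[:n_num] for m in members])]
            modes = [Counter(col).most_common(1)[0][0]
                     for col in zip(*[m[n_num:] for m in members])]
            protos[c] = tuple(means) + tuple(modes)
    return labels, protos

def split_candidates(rows, labels, protos, n_num, gamma=1.0,
                     min_size=3, max_spread=1.0):
    """Filtering step (assumed rule): a large, tight cluster is treated as
    obviously normal; members of all other clusters become candidates."""
    normal_clusters, candidates = [], []
    for c, proto in enumerate(protos):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        spread = sum(mixed_distance(rows[i], proto, n_num, gamma)
                     for i in idx) / len(idx)
        if len(idx) >= min_size and spread <= max_spread:
            normal_clusters.append(c)
        else:
            candidates.extend(idx)
    return normal_clusters, candidates

def anomaly_features(row, own_proto, normal_protos, n_num, gamma=1.0):
    """Two assumed distance features: local (to the row's own prototype)
    and global (to the nearest prototype of a normal cluster)."""
    local = mixed_distance(row, own_proto, n_num, gamma)
    global_ = min((mixed_distance(row, p, n_num, gamma) for p in normal_protos),
                  default=float("inf"))
    return (local, global_)

def knn_predict(train, query, k=3):
    """Majority vote among the k labeled feature pairs closest to query."""
    nearest = sorted(
        train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], query)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

A typical call sequence would run `k_prototypes` on the session rows, pass the result to `split_candidates`, compute `anomaly_features` for each candidate, and classify each feature pair with `knn_predict` against a small labeled set. Seeding the prototypes explicitly keeps the sketch deterministic; a practical run would more likely use random or density-based initialization.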