Combining Log Files and Monitoring Data to Detect Anomaly Patterns in a Data Center

Context—Anomaly detection in a data center is a challenging task, having to consider different services on various resources. Current literature shows the application of artificial intelligence and machine learning techniques to either log files or monitoring data: the former created by services at...

Bibliographic Details
Main Authors: Laura Viola, Elisabetta Ronchieri, Claudia Cavallaro
Format: Article
Language: English
Published: MDPI AG 2022-07-01
Series: Computers
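Subjects: log analysis; monitoring data; anomaly detection; natural language processing; topic modeling; clustering technique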
Online Access: https://www.mdpi.com/2073-431X/11/8/117
author Laura Viola
Elisabetta Ronchieri
Claudia Cavallaro
collection DOAJ
description Context: Anomaly detection in a data center is a challenging task because it must account for different services running on various resources. The current literature applies artificial intelligence and machine learning techniques to either log files or monitoring data: the former are created by services at run time, while the latter are produced by dedicated sensors directly on the physical or virtual machine. Objectives: We propose a model that exploits information in both log files and monitoring data to identify patterns and detect anomalies over time at both the service level and the machine level. Methods: The key idea is to construct a specific dictionary for each log file, which helps to extract anomalous n-grams in the feature matrix. Several Natural Language Processing techniques, such as word clouds and topic modeling, were used to enrich this dictionary. A clustering algorithm was then applied to the feature matrix to identify and group the various types of anomalies. In parallel, a time series anomaly detection technique was applied to the sensor data so that problems found in the log files could be combined with problems recorded in the monitoring data. Several services (i.e., log files) running on the same machine were grouped together with the monitoring metrics. Results: We tested our approach on a real data center whose log files and monitoring data characterize the behaviour of physical and virtual resources in production. The data were provided by the National Institute for Nuclear Physics in Italy. We observed a correspondence between anomalies in log files and monitoring data, e.g., a decrease in memory usage or an increase in machine load. The results are promising. Conclusions: Important outcomes have emerged thanks to the integration of these two types of data. Our model requires integrating site administrators’ expertise in order to consider all critical scenarios in the data center and to interpret the results properly.
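The Methods summary above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary terms, the sample log lines, the load values, and the z-score rule are all hypothetical stand-ins chosen for illustration, and the clustering step is left out. The sketch only shows how a per-log dictionary can select n-grams for a feature matrix and how a simple outlier rule can flag samples in a monitoring series.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical dictionary of anomalous terms for one log file (illustrative only).
ANOMALY_TERMS = {"error", "failed", "timeout", "refused"}


def ngrams(tokens, n):
    """Return the n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def anomalous_ngram_counts(lines, n=2):
    """For each log line, count the n-grams containing a dictionary term."""
    rows = []
    for line in lines:
        tokens = line.lower().split()
        hits = [g for g in ngrams(tokens, n) if ANOMALY_TERMS & set(g)]
        rows.append(Counter(hits))
    return rows


def feature_matrix(rows):
    """Turn per-line counts into a dense matrix with a fixed column order."""
    vocab = sorted({g for row in rows for g in row})
    return vocab, [[row[g] for g in vocab] for row in rows]


def zscore_outliers(series, threshold=2.0):
    """Return indices lying more than `threshold` std devs from the mean.

    A simple stand-in for a time series anomaly detector, not the
    technique used in the paper.
    """
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mu) / sigma > threshold]


# Made-up log lines and machine-load samples.
log_lines = [
    "connection accepted from client",
    "request failed with timeout",
    "request served in 12 ms",
    "connection refused by server",
]
vocab, matrix = feature_matrix(anomalous_ngram_counts(log_lines))
flagged_lines = [i for i, row in enumerate(matrix) if sum(row) > 0]

load = [0.9, 1.0, 1.1, 1.0, 0.9, 1.0, 1.1, 6.0]
flagged_samples = zscore_outliers(load)

print(flagged_lines, flagged_samples)
```

In a fuller pipeline, the rows of `matrix` would be passed to a clustering algorithm, and flagged log lines and flagged samples would be compared within shared time windows on the same machine.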
first_indexed 2024-03-09T04:35:04Z
format Article
id doaj.art-948a65411c874b27a9255d50ae33726a
institution Directory Open Access Journal
issn 2073-431X
language English
last_indexed 2024-03-09T04:35:04Z
publishDate 2022-07-01
publisher MDPI AG
record_format Article
series Computers
spelling doaj.art-948a65411c874b27a9255d50ae33726a
2023-12-03T13:29:26Z
eng
MDPI AG
Computers
2073-431X
2022-07-01
Volume 11, Issue 8, Article 117
10.3390/computers11080117
Combining Log Files and Monitoring Data to Detect Anomaly Patterns in a Data Center
Laura Viola, Department of Statistical Sciences, University of Bologna, 40126 Bologna, Italy
Elisabetta Ronchieri, Department of Statistical Sciences, University of Bologna, 40126 Bologna, Italy
Claudia Cavallaro, Department of Mathematics and Computer Science, University of Catania, 95124 Catania, Italy
https://www.mdpi.com/2073-431X/11/8/117
log analysis; monitoring data; anomaly detection; natural language processing; topic modeling; clustering technique
title Combining Log Files and Monitoring Data to Detect Anomaly Patterns in a Data Center
topic log analysis
monitoring data
anomaly detection
natural language processing
topic modeling
clustering technique
url https://www.mdpi.com/2073-431X/11/8/117