The Use of Template Miners and Encryption in Log Message Compression

Presently, almost all software produces a large number of log messages based on events and activities that occur while it is in use. These log files contain valuable runtime information that can be used in a variety of applications such as anomaly detection, error prediction, template mining, and so on...

Bibliographic Details
Main Authors: Péter Marjai, Péter Lehotay-Kéry, Attila Kiss
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Computers
Subjects: log file processing, template mining, compression, encryption
Online Access: https://www.mdpi.com/2073-431X/10/7/83
author Péter Marjai
Péter Lehotay-Kéry
Attila Kiss
collection DOAJ
description Presently, almost all software produces a large number of log messages based on events and activities that occur while it is in use. These log files contain valuable runtime information that can be used in a variety of applications, such as anomaly detection, error prediction, and template mining. Usually, the generated log messages are raw, meaning they have an unstructured format, so they have to be parsed before data mining models can be applied. After parsing, template miners can be applied to the data to retrieve the events occurring in the log file. Each event consists of two parts: the template, which is the fixed part and is the same for all instances of the same event type, and the parameter part, which varies between instances. To decrease the size of the log messages, we use the mined templates to build a dictionary for the events and store only the dictionary, the event ID, and the parameter list. We use six template miners to acquire the templates, namely IPLoM, LenMa, LogMine, Spell, Drain, and MoLFI. In this paper, we evaluate the compression capacity of our dictionary method in combination with these algorithms. Since parameters may contain sensitive information, we also encrypt the files after compression and measure the changes in file size. We also examine the speed of the log miner algorithms. Based on our experiments, LenMa has the best compression rate, with an average of 67.4%; however, because of its high runtime, we suggest combining our dictionary method with IPLoM and FFX, since this combination is the fastest of all methods and has a 57.7% compression rate.
first_indexed 2024-03-10T10:07:41Z
format Article
id doaj.art-9941e9308e99466ea59ece28d883de0a
institution Directory Open Access Journal
issn 2073-431X
language English
last_indexed 2024-03-10T10:07:41Z
publishDate 2021-06-01
publisher MDPI AG
record_format Article
series Computers
spelling doaj.art-9941e9308e99466ea59ece28d883de0a
2023-11-22T01:24:01Z
eng
MDPI AG
Computers
2073-431X
2021-06-01
Volume 10, Issue 7, Article 83
10.3390/computers10070083
The Use of Template Miners and Encryption in Log Message Compression
Péter Marjai, Péter Lehotay-Kéry, Attila Kiss (all: Department of Information Systems, ELTE Eötvös Loránd University, 1117 Budapest, Hungary)
https://www.mdpi.com/2073-431X/10/7/83
log file processing
template mining
compression
encryption
title The Use of Template Miners and Encryption in Log Message Compression
topic log file processing
template mining
compression
encryption
url https://www.mdpi.com/2073-431X/10/7/83
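
The dictionary method summarized in the description (store the template dictionary once, then only an event ID and a parameter list per message) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the templates have already been mined, that parameter slots are marked with a `<*>` wildcard, and that matching is done with regular expressions. The function names and the sample log lines are hypothetical.

```python
import re

WILDCARD = "<*>"  # assumed placeholder marking a parameter slot in a template


def template_to_regex(template):
    # Escape the fixed parts and turn every wildcard into a capture group.
    parts = [re.escape(p) for p in template.split(WILDCARD)]
    return re.compile("(.+?)".join(parts))


def compress(lines, templates):
    """Return (dictionary, records); each record is (event ID, parameter list)."""
    dictionary = dict(enumerate(templates))
    patterns = {eid: template_to_regex(t) for eid, t in dictionary.items()}
    records = []
    for line in lines:
        for eid, pattern in patterns.items():
            match = pattern.fullmatch(line)
            if match:
                records.append((eid, list(match.groups())))
                break
        else:
            # No template matched: keep the raw line so nothing is lost.
            records.append((None, [line]))
    return dictionary, records


def decompress(dictionary, records):
    lines = []
    for eid, params in records:
        if eid is None:
            lines.append(params[0])
            continue
        pieces = dictionary[eid].split(WILDCARD)
        # Interleave the fixed pieces with the stored parameters.
        line = pieces[0]
        for param, piece in zip(params, pieces[1:]):
            line += param + piece
        lines.append(line)
    return lines


# Hypothetical sample data, for illustration only.
sample_templates = ["Connection from <*> closed", "User <*> logged in from <*>"]
sample_lines = [
    "Connection from 10.0.0.5 closed",
    "User alice logged in from 10.0.0.7",
    "Connection from 10.0.0.9 closed",
    "disk check skipped",
]
dictionary, records = compress(sample_lines, sample_templates)
restored = decompress(dictionary, records)
```

In this sketch the fixed text of each event is stored once in `dictionary`, so repeated messages cost only an integer ID plus their parameters, and `decompress` restores the original lines exactly. Lines that match no template are kept verbatim under the ID `None`, a design choice made here to keep the round trip lossless.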