LTmatch: A Method to Abstract Pattern from Unstructured Log

Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with, therefore it is importa...

Bibliographic Details
Main Authors: Xiaodong Wang, Yining Zhao, Haili Xiao, Xiaoning Wang, Xuebin Chi
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Applied Sciences
Subjects: log pattern extraction; word matching rate; LCS; log template
Online Access: https://www.mdpi.com/2076-3417/11/11/5302
author Xiaodong Wang
Yining Zhao
Haili Xiao
Xiaoning Wang
Xuebin Chi
collection DOAJ
description Logs record valuable data from different software and systems. Execution logs are widely available and are helpful for monitoring, examining, and understanding complex applications. However, log files usually contain too many lines for a human to handle, so it is important to develop methods for processing logs by computer. Logs are usually unstructured, which hinders automatic analysis, so categorizing logs and turning them into structured data automatically is of great practical significance. This paper proposes the LTmatch algorithm, a log pattern extraction algorithm based on a weighted word matching rate. Compared with our previous work, this algorithm not only classifies logs according to the longest common subsequence (LCS) but also obtains and updates the log templates in real time. In addition, the pattern warehouse of the algorithm stores the log patterns in a fixed-depth tree, which improves the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied it to an open-source data set containing different kinds of labeled log data and compared it with a variety of state-of-the-art log pattern extraction algorithms. The results show that our method improves average accuracy by 2.67% over the best result among the other methods.
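The description names two ingredients, LCS-based classification and incremental template updating. The Python sketch below illustrates that general idea only and is not the authors' implementation. The `<*>` wildcard, the 0.5 threshold, the unweighted matching rate, and the flat list of templates are all assumptions made for illustration; the record does not specify the paper's weighting scheme or the layout of its tree-based pattern warehouse, so neither is reproduced here.

```python
from typing import List

WILDCARD = "<*>"  # assumed placeholder for the variable part of a log line


def lcs_tokens(a: List[str], b: List[str]) -> List[str]:
    """Return one longest common subsequence of two token lists (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def matching_rate(template: List[str], line: List[str]) -> float:
    """Unweighted stand-in for a word matching rate: shared words / longer length."""
    constants = [tok for tok in template if tok != WILDCARD]
    return len(lcs_tokens(constants, line)) / max(len(template), len(line))


def merge_template(template: List[str], line: List[str]) -> List[str]:
    """Keep the common words in order and collapse every differing stretch to a wildcard."""
    merged: List[str] = []
    i = j = 0
    for tok in lcs_tokens(template, line):
        gap = False
        while template[i] != tok:  # skip template tokens not in the LCS
            i += 1
            gap = True
        while line[j] != tok:  # skip line tokens not in the LCS
            j += 1
            gap = True
        if gap and (not merged or merged[-1] != WILDCARD):
            merged.append(WILDCARD)
        merged.append(tok)
        i += 1
        j += 1
    if (i < len(template) or j < len(line)) and (not merged or merged[-1] != WILDCARD):
        merged.append(WILDCARD)  # trailing tokens differ
    return merged


def extract_templates(lines: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Process lines one at a time, updating the best template or creating a new one."""
    templates: List[List[str]] = []
    for raw in lines:
        tokens = raw.split()
        best = max(range(len(templates)),
                   key=lambda k: matching_rate(templates[k], tokens),
                   default=None)
        if best is not None and matching_rate(templates[best], tokens) >= threshold:
            templates[best] = merge_template(templates[best], tokens)
        else:
            templates.append(tokens)
    return templates
```

Scanning every stored template for each line is the simplest possible store; a tree-based warehouse such as the one the description mentions would narrow that search before the LCS comparison is made.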
first_indexed 2024-03-10T10:39:08Z
format Article
id doaj.art-86b3aeb4dcca4e44a814f7604ce188a9
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T10:39:08Z
publishDate 2021-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-86b3aeb4dcca4e44a814f7604ce188a9
2023-11-21T23:07:20Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-06-01
Volume 11, Issue 11, Article 5302
10.3390/app11115302
LTmatch: A Method to Abstract Pattern from Unstructured Log
Xiaodong Wang, Yining Zhao, Haili Xiao, Xiaoning Wang, Xuebin Chi (all: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China)
https://www.mdpi.com/2076-3417/11/11/5302
log pattern extraction
word matching rate
LCS
log template
title LTmatch: A Method to Abstract Pattern from Unstructured Log
topic log pattern extraction
word matching rate
LCS
log template
url https://www.mdpi.com/2076-3417/11/11/5302