LTmatch: A Method to Abstract Pattern from Unstructured Log
Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with, therefore it is importa...
Main Authors: | Xiaodong Wang, Yining Zhao, Haili Xiao, Xiaoning Wang, Xuebin Chi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-06-01 |
Series: | Applied Sciences |
Subjects: | log pattern extraction, word matching rate, LCS, log template |
Online Access: | https://www.mdpi.com/2076-3417/11/11/5302 |
_version_ | 1797531087480553472 |
---|---|
author | Xiaodong Wang Yining Zhao Haili Xiao Xiaoning Wang Xuebin Chi |
author_facet | Xiaodong Wang Yining Zhao Haili Xiao Xiaoning Wang Xuebin Chi |
author_sort | Xiaodong Wang |
collection | DOAJ |
description | Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with; therefore, it is important to develop methods to process logs by computer. Logs are usually unstructured, which is not conducive to automatic analysis. Automatically categorizing logs and turning them into structured data is of great practical significance. In this paper, the LTmatch algorithm is proposed, which implements a log pattern extraction algorithm based on a weighted word matching rate. Compared with our previous work, this algorithm not only classifies the logs according to the longest common subsequence (LCS) but also obtains and updates the log template in real time. In addition, the pattern warehouse of the algorithm uses a fixed-depth tree to store the log patterns, which optimizes the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied it to an open-source data set with different kinds of labeled log data. A variety of state-of-the-art log pattern extraction algorithms are used for comparison. The results show that our method improves average accuracy by 2.67% over the best result among the other methods. |
first_indexed | 2024-03-10T10:39:08Z |
format | Article |
id | doaj.art-86b3aeb4dcca4e44a814f7604ce188a9 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T10:39:08Z |
publishDate | 2021-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-86b3aeb4dcca4e44a814f7604ce188a9; 2023-11-21T23:07:20Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2021-06-01; 11; 11; 5302; 10.3390/app11115302; LTmatch: A Method to Abstract Pattern from Unstructured Log; Xiaodong Wang (0), Yining Zhao (1), Haili Xiao (2), Xiaoning Wang (3), Xuebin Chi (4); all authors: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with; therefore, it is important to develop methods to process logs by computer. Logs are usually unstructured, which is not conducive to automatic analysis. Automatically categorizing logs and turning them into structured data is of great practical significance. In this paper, the LTmatch algorithm is proposed, which implements a log pattern extraction algorithm based on a weighted word matching rate. Compared with our previous work, this algorithm not only classifies the logs according to the longest common subsequence (LCS) but also obtains and updates the log template in real time. In addition, the pattern warehouse of the algorithm uses a fixed-depth tree to store the log patterns, which optimizes the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied it to an open-source data set with different kinds of labeled log data. A variety of state-of-the-art log pattern extraction algorithms are used for comparison. The results show that our method improves average accuracy by 2.67% over the best result among the other methods.; https://www.mdpi.com/2076-3417/11/11/5302; log pattern extraction; word matching rate; LCS; log template |
spellingShingle | Xiaodong Wang Yining Zhao Haili Xiao Xiaoning Wang Xuebin Chi LTmatch: A Method to Abstract Pattern from Unstructured Log Applied Sciences log pattern extraction word matching rate LCS log template |
title | LTmatch: A Method to Abstract Pattern from Unstructured Log |
title_full | LTmatch: A Method to Abstract Pattern from Unstructured Log |
title_fullStr | LTmatch: A Method to Abstract Pattern from Unstructured Log |
title_full_unstemmed | LTmatch: A Method to Abstract Pattern from Unstructured Log |
title_short | LTmatch: A Method to Abstract Pattern from Unstructured Log |
title_sort | ltmatch a method to abstract pattern from unstructured log |
topic | log pattern extraction word matching rate LCS log template |
url | https://www.mdpi.com/2076-3417/11/11/5302 |
work_keys_str_mv | AT xiaodongwang ltmatchamethodtoabstractpatternfromunstructuredlog AT yiningzhao ltmatchamethodtoabstractpatternfromunstructuredlog AT hailixiao ltmatchamethodtoabstractpatternfromunstructuredlog AT xiaoningwang ltmatchamethodtoabstractpatternfromunstructuredlog AT xuebinchi ltmatchamethodtoabstractpatternfromunstructuredlog |
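
The description field in this record mentions classifying log lines by their longest common subsequence (LCS) and updating a log template as new lines arrive. The sketch below illustrates that general idea only; it is not the LTmatch algorithm. It assumes whitespace tokenization, an unweighted match rate (LCS length divided by the longer token count) in place of the paper's weighted word matching rate, and a `<*>` wildcard for variable fields. The function names `lcs_tokens`, `match_rate`, and `merge_template` are hypothetical.

```python
def lcs_tokens(a, b):
    """Return one longest common subsequence of two token lists."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover the subsequence itself.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def match_rate(a, b):
    """Unweighted similarity (an assumption): LCS length over the longer length."""
    if not a and not b:
        return 1.0
    return len(lcs_tokens(a, b)) / max(len(a), len(b))


def merge_template(template, tokens, wildcard="<*>"):
    """Keep the common tokens and collapse every differing run into one wildcard."""
    common = lcs_tokens(template, tokens)
    merged, i, j = [], 0, 0
    # The trailing None acts as a sentinel that flushes any leftover tokens.
    for tok in common + [None]:
        gap = False
        while i < len(template) and template[i] != tok:
            i += 1
            gap = True
        while j < len(tokens) and tokens[j] != tok:
            j += 1
            gap = True
        if gap and (not merged or merged[-1] != wildcard):
            merged.append(wildcard)
        if tok is not None:
            merged.append(tok)
            i += 1
            j += 1
    return merged


line_a = "Connection from 10.0.0.1 closed".split()
line_b = "Connection from 10.0.0.2 closed".split()
print(match_rate(line_a, line_b))
print(" ".join(merge_template(line_a, line_b)))
```

In a pipeline of this kind, a new line would typically be compared against stored templates, merged into the best match when the rate exceeds a chosen threshold, and stored as a new template otherwise; the threshold and the storage structure are left out of this sketch.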