Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms
Previous methods for system intrusion detection have mainly relied on pattern matching that employs prior knowledge extracted from experts’ domain knowledge. However, pattern matching-based methods have a major drawback: they can be bypassed through various modified te...
Main Authors: | Czangyeob Kim, Myeongjun Jang, Seungwan Seo, Kyeongchan Park, Pilsung Kang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
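Subjects: | System anomaly detection; cyber security; system log embedding; advanced persistent threat; ADFA-LD |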
Online Access: | https://ieeexplore.ieee.org/document/9399070/ |
_version_ | 1818445629625466880 |
---|---|
author | Czangyeob Kim, Myeongjun Jang, Seungwan Seo, Kyeongchan Park, Pilsung Kang |
author_facet | Czangyeob Kim, Myeongjun Jang, Seungwan Seo, Kyeongchan Park, Pilsung Kang |
author_sort | Czangyeob Kim |
collection | DOAJ |
description | Previous methods for system intrusion detection have mainly relied on pattern matching that employs prior knowledge extracted from experts’ domain knowledge. However, pattern matching-based methods have a major drawback: they can be bypassed through various modified techniques. Advanced persistent threats limit pattern matching-based detection mechanisms because they are not only more sophisticated than usual threats but also specialized for the targeted object of attack. A defense mechanism must comprehend unusual phenomena or behaviors to successfully handle such advanced threats. To achieve this, various security techniques based on machine learning have been developed recently. Among these, anomaly detection algorithms, which are trained in an unsupervised fashion, can reduce the effort required of security experts and help secure labeled datasets through post analysis. Abnormal behaviors can be distinguished more precisely by training classification models if a sufficient amount of labeled data is obtained through post analysis of the anomaly detection results. In this study, we propose an end-to-end abnormal behavior detection method based on sequential information preserving log embedding algorithms and machine learning-based anomaly detection algorithms. In contrast to other machine learning-based system anomaly detection models, which borrow domain experts’ knowledge to extract significant features from the log data, raw log data are transformed into fixed-size continuous vectors regardless of their length, and these vectors are used to train the anomaly detection models. In experiments on a real system call trace dataset, the proposed log embedding method combined with an unsupervised anomaly detection model yielded favorable performance, with an AUROC of up to 0.8708, which can be further improved to as high as 0.9745 with supervised classification algorithms if sufficient labeled attack log data become available. |
first_indexed | 2024-12-14T19:34:52Z |
format | Article |
id | doaj.art-019f5acb19e943148b2bd622f276c242 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-14T19:34:52Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-019f5acb19e943148b2bd622f276c242; 2022-12-21T22:49:55Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 58088; 58101; 10.1109/ACCESS.2021.3071763; 9399070; Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms; Czangyeob Kim [0] https://orcid.org/0000-0002-9784-2399; Myeongjun Jang [1] https://orcid.org/0000-0002-9352-4799; Seungwan Seo [2] https://orcid.org/0000-0001-5204-3350; Kyeongchan Park [3]; Pilsung Kang [4] https://orcid.org/0000-0001-7663-3937; [0] School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; [1] Department of Computer Science, University of Oxford, Oxford, U.K.; [2] School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; [3] School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; [4] School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; Previous methods for system intrusion detection have mainly relied on pattern matching that employs prior knowledge extracted from experts’ domain knowledge. However, pattern matching-based methods have a major drawback: they can be bypassed through various modified techniques. Advanced persistent threats limit pattern matching-based detection mechanisms because they are not only more sophisticated than usual threats but also specialized for the targeted object of attack. A defense mechanism must comprehend unusual phenomena or behaviors to successfully handle such advanced threats. To achieve this, various security techniques based on machine learning have been developed recently. Among these, anomaly detection algorithms, which are trained in an unsupervised fashion, can reduce the effort required of security experts and help secure labeled datasets through post analysis. Abnormal behaviors can be distinguished more precisely by training classification models if a sufficient amount of labeled data is obtained through post analysis of the anomaly detection results. In this study, we propose an end-to-end abnormal behavior detection method based on sequential information preserving log embedding algorithms and machine learning-based anomaly detection algorithms. In contrast to other machine learning-based system anomaly detection models, which borrow domain experts’ knowledge to extract significant features from the log data, raw log data are transformed into fixed-size continuous vectors regardless of their length, and these vectors are used to train the anomaly detection models. In experiments on a real system call trace dataset, the proposed log embedding method combined with an unsupervised anomaly detection model yielded favorable performance, with an AUROC of up to 0.8708, which can be further improved to as high as 0.9745 with supervised classification algorithms if sufficient labeled attack log data become available.; https://ieeexplore.ieee.org/document/9399070/; System anomaly detection; cyber security; system log embedding; advanced persistent threat; ADFA-LD |
spellingShingle | Czangyeob Kim; Myeongjun Jang; Seungwan Seo; Kyeongchan Park; Pilsung Kang; Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms; IEEE Access; System anomaly detection; cyber security; system log embedding; advanced persistent threat; ADFA-LD |
title | Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms |
title_sort | intrusion detection based on sequential information preserving log embedding methods and anomaly detection algorithms |
topic | System anomaly detection; cyber security; system log embedding; advanced persistent threat; ADFA-LD |
url | https://ieeexplore.ieee.org/document/9399070/ |
work_keys_str_mv | AT czangyeobkim intrusiondetectionbasedonsequentialinformationpreservinglogembeddingmethodsandanomalydetectionalgorithms AT myeongjunjang intrusiondetectionbasedonsequentialinformationpreservinglogembeddingmethodsandanomalydetectionalgorithms AT seungwanseo intrusiondetectionbasedonsequentialinformationpreservinglogembeddingmethodsandanomalydetectionalgorithms AT kyeongchanpark intrusiondetectionbasedonsequentialinformationpreservinglogembeddingmethodsandanomalydetectionalgorithms AT pilsungkang intrusiondetectionbasedonsequentialinformationpreservinglogembeddingmethodsandanomalydetectionalgorithms |
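
The description field above outlines a pipeline: variable-length system call traces are turned into fixed-size vectors that keep ordering information, an anomaly detector is trained on those vectors without labels, and the result is evaluated with AUROC. The following is a minimal Python sketch of that general idea only, not the authors' method. As loudly stated assumptions, it swaps in a normalized bigram count vector as a stand-in for the paper's log embedding and a distance-to-centroid score as a stand-in for its anomaly detection algorithms; `VOCAB`, the toy traces, and all function names are hypothetical.

```python
import math

VOCAB = 8          # hypothetical number of distinct system call IDs (0..7)
DIM = VOCAB ** 2   # one slot per ordered pair of consecutive calls


def embed(trace):
    """Map a trace of any length to a fixed-size, L2-normalized vector.

    Stand-in embedding (assumption): counts of consecutive call pairs,
    which keep local ordering information that plain call counts lose.
    """
    vec = [0.0] * DIM
    for a, b in zip(trace, trace[1:]):
        vec[a * VOCAB + b] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def centroid(vectors):
    """Mean vector of the (unlabeled, assumed mostly normal) training set."""
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(DIM)]


def anomaly_score(vec, center):
    """Stand-in detector (assumption): Euclidean distance to the centroid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, center)))


def auroc(scores, labels):
    """Chance that a random attack trace outscores a random normal trace.

    Ties count as one half; labels are 1 for attack and 0 for normal.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data: normal traces repeat one call pattern; "attack" traces use the
# same calls in reverse order, so only order-aware features separate them.
train = [[3, 4, 5, 6] * k for k in (3, 5, 8)]
test_traces = [[3, 4, 5, 6] * 4, [3, 4, 5, 6] * 6,
               [6, 5, 4, 3] * 4, [6, 5, 4, 3] * 6]
labels = [0, 0, 1, 1]

center = centroid([embed(t) for t in train])
scores = [anomaly_score(embed(t), center) for t in test_traces]
print("AUROC on toy data:", auroc(scores, labels))
```

The toy attack traces contain exactly the same calls as the normal ones, so a representation that discarded ordering would not separate them; this is the motivation for order-preserving embeddings that the description mentions. The toy score says nothing about the 0.8708 and 0.9745 AUROC values reported in the record.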