Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms

Previous methods for system intrusion detection have mainly relied on pattern matching that employs prior knowledge drawn from experts’ domain knowledge. However, pattern matching-based methods have a major drawback: they can be bypassed through various modified te...

Bibliographic Details
Main Authors:	Czangyeob Kim, Myeongjun Jang, Seungwan Seo, Kyeongchan Park, Pilsung Kang
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	System anomaly detection, cyber security, system log embedding, advanced persistent threat, ADFA-LD
Online Access:	https://ieeexplore.ieee.org/document/9399070/

author	Czangyeob Kim, Myeongjun Jang, Seungwan Seo, Kyeongchan Park, Pilsung Kang
collection	DOAJ
description	Previous methods for system intrusion detection have mainly relied on pattern matching that employs prior knowledge drawn from experts’ domain knowledge. However, pattern matching-based methods have a major drawback: they can be bypassed through various modified techniques. Advanced persistent threats in particular limit pattern matching-based detection mechanisms, because they are not only more sophisticated than ordinary threats but also tailored to the targeted system. To handle such advanced threats successfully, the defense mechanism must be able to recognize unusual phenomena or behaviors. To achieve this, various security techniques based on machine learning have been developed recently. Among these, anomaly detection algorithms, which are trained in an unsupervised fashion, can reduce the effort required of security experts and yield labeled datasets through post-analysis. Abnormal behaviors can be distinguished even more precisely by training classification models once a sufficient amount of labeled data has been obtained through post-analysis of the anomaly detection results. In this study, we propose an end-to-end abnormal behavior detection method based on sequential information preserving log embedding algorithms and machine learning-based anomaly detection algorithms. In contrast to other machine learning-based system anomaly detection models, which rely on domain experts’ knowledge to extract significant features from the log data, our method transforms raw log data into fixed-size continuous vectors regardless of their length, and these vectors are used to train the anomaly detection models. In experiments on a real system call trace dataset, the proposed log embedding method combined with an unsupervised anomaly detection model achieved an AUROC of up to 0.8708, and this can be improved to as much as 0.9745 with supervised classification algorithms if sufficient labeled attack log data become available.
first_indexed	2024-12-14T19:34:52Z
format	Article
id	doaj.art-019f5acb19e943148b2bd622f276c242
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-12-14T19:34:52Z
publishDate	2021-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-019f5acb19e943148b2bd622f276c242 | 2022-12-21T22:49:55Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2021-01-01 | vol. 9, pp. 58088-58101 | DOI 10.1109/ACCESS.2021.3071763 | document 9399070 | Czangyeob Kim (https://orcid.org/0000-0002-9784-2399; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea) | Myeongjun Jang (https://orcid.org/0000-0002-9352-4799; Department of Computer Science, University of Oxford, Oxford, U.K.) | Seungwan Seo (https://orcid.org/0000-0001-5204-3350; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea) | Kyeongchan Park (School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea) | Pilsung Kang (https://orcid.org/0000-0001-7663-3937; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea)
title	Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms
topic	System anomaly detection, cyber security, system log embedding, advanced persistent threat, ADFA-LD
url	https://ieeexplore.ieee.org/document/9399070/

Similar Items