Comparing Autoencoder and Isolation Forest in Network Anomaly Detection
Anomaly detection is essential to spot cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular due to the difficult and expensive process of labeling network data as well as their superior ability to detect unknown attacks when compared with supervised or signatur...
Main Authors: | Timotej Smolen, Lenka Benova |
---|---|
Format: | Article |
Language: | English |
Published: | FRUCT 2023-05-01 |
Series: | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
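Subjects: | anomaly detection, network traffic, intrusion detection, autoencoder, isolation forest, lstm, unsupervised learning, web server logs |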
Online Access: | https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf |
_version_ | 1797807544434950144 |
---|---|
author | Timotej Smolen, Lenka Benova |
author_sort | Timotej Smolen |
collection | DOAJ |
description | Anomaly detection is essential for spotting cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular because labeling network data is difficult and expensive, and because they are better at detecting unknown attacks than supervised or signature-based solutions. In this paper, we introduce an LSTM-based Autoencoder anomaly detection model trained in a fully unsupervised environment, with optimizations for minimal memory usage. We then compare the Autoencoder model with an Isolation Forest model by analysing their results. Our Autoencoder attempts to capture the profile of the data through dimensionality reduction, and its LSTM layers enable it to leverage data from previous requests. The reconstruction error is calculated to decide whether a request is anomalous. We train the models on a dataset of requests towards a web server in an unsupervised fashion. Before training, significant feature engineering is done to process multiple categorical attributes. The training process of the introduced Autoencoder is optimized for minimum memory usage. We evaluated the results based on our analysis of the data as well as their statistical features. A manual analysis revealed that the models focus differently on numerical and categorical attributes. The Isolation Forest disregards most categorical attributes and emphasizes numerical values. The Autoencoder, on the other hand, detects missing features more effectively but largely disregards numerical attributes. As such, the Autoencoder might have a higher probability of detecting a zero-day attack than the Isolation Forest. |
first_indexed | 2024-03-13T06:24:10Z |
format | Article |
id | doaj.art-ce5f5be6f56a43f98ee90ec512348569 |
institution | Directory Open Access Journal |
issn | 2305-7254 2343-0737 |
language | English |
last_indexed | 2024-03-13T06:24:10Z |
publishDate | 2023-05-01 |
publisher | FRUCT |
record_format | Article |
series | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
spelling | doaj.art-ce5f5be6f56a43f98ee90ec512348569; 2023-06-09T11:41:51Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; ISSN 2305-7254, 2343-0737; 2023-05-01; vol. 33, no. 1, pp. 276-282; DOI 10.23919/FRUCT58615.2023.10143005; Timotej Smolen and Lenka Benova, both Slovak University of Technology, Faculty of Informatics and Information Technologies |
title | Comparing Autoencoder and Isolation Forest in Network Anomaly Detection |
topic | anomaly detection, network traffic, intrusion detection, autoencoder, isolation forest, lstm, unsupervised learning, web server logs |
url | https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf |
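
The description says that reconstruction error is used to decide whether a request is anomalous. The sketch below illustrates only that scoring step and is not the authors' implementation. Several parts are assumptions made for illustration: `mean_reconstructor` is a hypothetical stand-in for the trained LSTM Autoencoder, the mean-plus-`k`-standard-deviations threshold is an assumed rule (this record does not state how the paper sets its threshold), and the request vectors are made-up values.

```python
from statistics import mean, pstdev


def reconstruction_errors(originals, reconstructions):
    """Mean squared error between each input vector and its reconstruction."""
    return [
        mean((x - r) ** 2 for x, r in zip(orig, recon))
        for orig, recon in zip(originals, reconstructions)
    ]


def flag_anomalies(errors, k=2.0):
    """Flag errors above mean + k * std.

    Assumed thresholding rule for illustration; the paper's actual
    threshold is not given in this record.
    """
    threshold = mean(errors) + k * pstdev(errors)
    return [e > threshold for e in errors], threshold


def mean_reconstructor(vectors):
    """Hypothetical stand-in for a trained autoencoder.

    It "reconstructs" every vector as the per-feature mean, so rows far
    from the typical profile get a large reconstruction error.
    """
    centre = [mean(column) for column in zip(*vectors)]
    return [list(centre) for _ in vectors]


# Toy, already-encoded request vectors (made-up values):
# nine similar rows followed by one row that looks very different.
requests = [[0.1 * (i % 3), 1.0, 0.0] for i in range(9)] + [[5.0, 0.0, 1.0]]

errors = reconstruction_errors(requests, mean_reconstructor(requests))
flags, threshold = flag_anomalies(errors)

for row, err, is_anomaly in zip(requests, errors, flags):
    print(row, round(err, 4), "ANOMALY" if is_anomaly else "ok")
```

In a real pipeline the stand-in would be replaced by the model's own reconstructions, and the threshold would likely be tuned on held-out traffic rather than fixed at `k=2.0`.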