Comparing Autoencoder and Isolation Forest in Network Anomaly Detection

Anomaly detection is essential to spot cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular due to the difficult and expensive process of labeling network data as well as their superior ability to detect unknown attacks when compared with supervised or signatur...

Bibliographic Details
Main Authors: Timotej Smolen, Lenka Benova
Format: Article
Language: English
Published: FRUCT 2023-05-01
Series: Proceedings of the XXth Conference of Open Innovations Association FRUCT
Subjects:
Online Access: https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf
_version_ 1797807544434950144
author Timotej Smolen
Lenka Benova
collection DOAJ
description Anomaly detection is essential to spot cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular due to the difficult and expensive process of labeling network data as well as their superior ability to detect unknown attacks when compared with supervised or signature-based solutions. In this paper, we first introduce an LSTM-based Autoencoder anomaly detection model trained in a fully unsupervised environment, with optimizations for minimal memory usage. Secondly, we compare the Autoencoder model with an Isolation Forest model by analysing their results. Our Autoencoder attempts to capture the profile of the data through dimensionality reduction, and the use of LSTM layers enables it to leverage the data from previous requests. The reconstruction error is calculated to decide whether a request is anomalous. We train the models on a dataset of requests towards a webserver in an unsupervised fashion. Before training, significant feature engineering is done to process multiple categorical attributes. The training process of the introduced Autoencoder is optimized for minimum memory usage. We evaluated the results based on our analysis of the data as well as their statistical features. A manual analysis revealed that the two models focus differently on numerical and categorical attributes. The Isolation Forest disregards most categorical attributes and emphasizes numerical values. The Autoencoder, on the other hand, detects missing features more effectively but largely disregards numerical attributes. As such, the Autoencoder might have a higher probability of detecting a zero-day attack when compared to the Isolation Forest.
first_indexed 2024-03-13T06:24:10Z
format Article
id doaj.art-ce5f5be6f56a43f98ee90ec512348569
institution Directory Open Access Journal
issn 2305-7254
2343-0737
language English
last_indexed 2024-03-13T06:24:10Z
publishDate 2023-05-01
publisher FRUCT
record_format Article
series Proceedings of the XXth Conference of Open Innovations Association FRUCT
spelling doaj.art-ce5f5be6f56a43f98ee90ec512348569
2023-06-09T11:41:51Z
eng
FRUCT
Proceedings of the XXth Conference of Open Innovations Association FRUCT
2305-7254
2343-0737
2023-05-01
33
1
276
282
10.23919/FRUCT58615.2023.10143005
Comparing Autoencoder and Isolation Forest in Network Anomaly Detection
Timotej Smolen (Slovak University of Technology, Faculty of Informatics and Information Technologies)
Lenka Benova (Slovak University of Technology, Faculty of Informatics and Information Technologies)
https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf
anomaly detection network traffic intrusion detection autoencoder isolation forest lstm unsupervised learning web server logs
topic anomaly detection; network traffic; intrusion detection; autoencoder; isolation forest; lstm; unsupervised learning; web server logs
url https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf
work_keys_str_mv AT timotejsmolen comparingautoencoderandisolationforestinnetworkanomalydetection
AT lenkabenova comparingautoencoderandisolationforestinnetworkanomalydetection
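
The description above states that categorical attributes are processed before training and that a reconstruction error decides whether a request is anomalous. The following is a minimal sketch of that idea, not the authors' implementation. Everything named here is an assumption made for illustration: the toy METHODS vocabulary, the extra one-hot slot for missing values, the quantile-based threshold, and above all the reconstruct function, which is a trivial stand-in (the per-feature mean of the training data) for a trained LSTM Autoencoder.

```python
from statistics import mean


def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot list.

    The last slot is reserved for a missing or unseen value (an assumption
    of this sketch, not something the record specifies).
    """
    vec = [0.0] * (len(vocabulary) + 1)
    idx = vocabulary.index(value) if value in vocabulary else len(vocabulary)
    vec[idx] = 1.0
    return vec


def reconstruction_error(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    return mean((o - r) ** 2 for o, r in zip(original, reconstructed))


def fit_threshold(train_errors, quantile=0.99):
    """Return the training error found at the given quantile position."""
    ranked = sorted(train_errors)
    idx = min(len(ranked) - 1, int(quantile * len(ranked)))
    return ranked[idx]


# Hypothetical toy data: the HTTP method of ten web-server requests.
METHODS = ["GET", "POST"]
train = [one_hot(m, METHODS) for m in ["GET"] * 8 + ["POST"] * 2]

# Stand-in for a trained autoencoder: every request is "reconstructed" as the
# per-feature mean of the training data. A real model would replace this.
profile = [mean(column) for column in zip(*train)]


def reconstruct(vector):
    return profile


train_errors = [reconstruction_error(v, reconstruct(v)) for v in train]
threshold = fit_threshold(train_errors, quantile=0.99)


def is_anomalous(vector):
    """Flag a request whose reconstruction error exceeds the threshold."""
    return reconstruction_error(vector, reconstruct(vector)) > threshold


if __name__ == "__main__":
    for method in ["GET", "POST", None]:
        print(method, is_anomalous(one_hot(method, METHODS)))
```

In this toy setup a request with a missing method lands in the reserved slot, which the stand-in model never reproduces, so its error exceeds every error seen in training. This loosely mirrors the record's observation that the Autoencoder detects missing features effectively. The threshold rule is one possible design choice; the record does not say how the authors set theirs.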