Comparing Autoencoder and Isolation Forest in Network Anomaly Detection

Bibliographic Details
Main Authors:	Timotej Smolen, Lenka Benova
Format:	Article
Language:	English
Published:	FRUCT 2023-05-01
Series:	Proceedings of the XXth Conference of Open Innovations Association FRUCT
Subjects:	anomaly detection, network traffic, intrusion detection, autoencoder, isolation forest, lstm, unsupervised learning, web server logs
Online Access:	https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf

_version_	1797807544434950144
author	Timotej Smolen Lenka Benova
author_facet	Timotej Smolen Lenka Benova
author_sort	Timotej Smolen
collection	DOAJ
description	Anomaly detection is essential for spotting cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular because labeling network data is difficult and expensive, and because they are better able to detect unknown attacks than supervised or signature-based solutions. In this paper, we first introduce an LSTM-based Autoencoder anomaly detection model trained in a fully unsupervised environment, with its training process optimized for minimal memory usage. Second, we compare the Autoencoder model with an Isolation Forest model by analysing their results. Our Autoencoder attempts to capture the profile of the data through dimensionality reduction, and its LSTM layers enable it to leverage data from previous requests. The reconstruction error is calculated to decide whether a request is anomalous. We train the models on a dataset of requests to a web server in an unsupervised fashion. Before training, significant feature engineering is done to process multiple categorical attributes. We evaluated the results based on our analysis of the data as well as their statistical features. A manual analysis revealed that the two models focus differently on numerical and categorical attributes. The Isolation Forest disregards most categorical attributes and emphasizes numerical values. The Autoencoder, on the other hand, detects missing features more effectively but largely disregards numerical attributes. As such, the Autoencoder might have a higher probability of detecting a zero-day attack than the Isolation Forest.
first_indexed	2024-03-13T06:24:10Z
format	Article
id	doaj.art-ce5f5be6f56a43f98ee90ec512348569
institution	Directory Open Access Journal
issn	2305-7254 2343-0737
language	English
last_indexed	2024-03-13T06:24:10Z
publishDate	2023-05-01
publisher	FRUCT
record_format	Article
series	Proceedings of the XXth Conference of Open Innovations Association FRUCT
spelling	doaj.art-ce5f5be6f56a43f98ee90ec512348569; 2023-06-09T11:41:51Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; ISSN 2305-7254, 2343-0737; 2023-05-01; vol. 33, no. 1, pp. 276-282; DOI 10.23919/FRUCT58615.2023.10143005; Timotej Smolen, Lenka Benova (both: Slovak University of Technology, Faculty of Informatics and Information Technologies)
title	Comparing Autoencoder and Isolation Forest in Network Anomaly Detection
topic	anomaly detection, network traffic, intrusion detection, autoencoder, isolation forest, lstm, unsupervised learning, web server logs
url	https://www.fruct.org/publications/volume-33/fruct33/files/Smo.pdf
work_keys_str_mv	AT timotejsmolen comparingautoencoderandisolationforestinnetworkanomalydetection AT lenkabenova comparingautoencoderandisolationforestinnetworkanomalydetection
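
The description above says that a reconstruction error is calculated to decide whether a request is anomalous. The record does not give the error measure or the decision rule, so the following is only a minimal sketch under stated assumptions: mean squared error per request, a fixed threshold chosen by the analyst, and reconstructions supplied by some already trained autoencoder. The function names and the sample values are hypothetical and are not taken from the paper.

```python
def reconstruction_errors(inputs, reconstructions):
    """Return the mean squared error per sample.

    Assumption: each sample is an equal-length list of numeric features,
    and `reconstructions` holds the autoencoder output for each input.
    """
    errors = []
    for x, x_hat in zip(inputs, reconstructions):
        squared = sum((a - b) ** 2 for a, b in zip(x, x_hat))
        errors.append(squared / len(x))
    return errors


def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its error exceeds the threshold.

    Assumption: a single fixed threshold; the paper may use another rule.
    """
    return [e > threshold for e in errors]


# Hypothetical encoded requests and their reconstructions.
inputs = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0]]
reconstructions = [[0.9, 0.1, 0.5], [0.1, 0.9, 0.5], [0.2, 0.1, 0.9]]

errors = reconstruction_errors(inputs, reconstructions)
flags = flag_anomalies(errors, threshold=0.1)
print(flags)
```

In this sketch the first two requests are reconstructed closely and stay below the threshold, while the third is reconstructed poorly and is flagged. In practice the threshold would have to be tuned on the data, for example from the distribution of errors on the training set.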

Similar Items