The Application of Deep Learning Imputation and Other Advanced Methods for Handling Missing Values in Network Intrusion Detection

Bibliographic Details
Main Authors: Mateusz Szczepański, Marek Pawlicki, Rafał Kozik, Michał Choraś
Format: Article
Language: English
Published: World Scientific Publishing 2023-02-01
Series: Vietnam Journal of Computer Science
Online Access: https://www.worldscientific.com/doi/10.1142/S2196888822500257
Description
Summary: Data play a critical role in intelligent information systems. Missing data is one of the most common problems in data collected in the real world, and it stems directly from the nature of data collection. This paper considers the handling of missing values in a real-world application of computational intelligence. Two experimental campaigns were conducted, evaluating different approaches to missing-value imputation on Random Forest-based classifiers trained on modern cybersecurity benchmark datasets: CICIDS2017 and IoT-23. The experiments showed that the choice of data imputation algorithm has a severe impact on the results of the classifier used for network intrusion detection. They also showed that one of the most popular approaches to handling missing data, complete case analysis, should never be used in cybersecurity.
ISSN: 2196-8888, 2196-8896
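
Illustrative note: the summary contrasts complete case analysis with imputation. The sketch below is a toy illustration of that contrast only, not the authors' method or data: the flow records, feature names, and the helper functions `complete_case` and `mean_impute` are hypothetical, and simple mean imputation stands in for the more advanced imputers the article evaluates.

```python
from statistics import mean

# Hypothetical toy flow records: (duration_s, bytes_sent, label).
# None marks a missing measurement. These values are invented for illustration.
records = [
    (1.2, 300.0, "benign"),
    (0.8, None, "benign"),
    (None, 5200.0, "attack"),
    (2.5, 4100.0, "attack"),
    (1.0, 280.0, "benign"),
    (3.1, None, "attack"),
]


def complete_case(rows):
    """Complete case analysis: keep only rows with no missing feature."""
    return [r for r in rows if None not in r[:-1]]


def mean_impute(rows):
    """Fill each missing feature with the mean of that column's observed values."""
    n_features = len(rows[0]) - 1
    col_means = [
        mean(r[i] for r in rows if r[i] is not None) for i in range(n_features)
    ]
    return [
        tuple(col_means[i] if r[i] is None else r[i] for i in range(n_features))
        + (r[-1],)
        for r in rows
    ]


kept = complete_case(records)
imputed = mean_impute(records)

# Compare how many rows, and how many attack rows, survive each strategy.
print("rows:", len(records), "complete-case:", len(kept), "imputed:", len(imputed))
print(
    "attack rows:",
    sum(r[-1] == "attack" for r in kept),
    "vs",
    sum(r[-1] == "attack" for r in imputed),
)
```

On this toy data, complete case analysis keeps 3 of the 6 rows and only one of the three attack rows, while imputation retains every row. That loss of training examples is one plausible reason such deletion can hurt a downstream classifier; the article itself should be consulted for the actual experimental findings.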