Summary: | Detecting cyber-security attacks is still a challenging task. This is due to the
evolving nature of the attacks. On the other hand, existing stream data learning
models with limited labelling have many limitations. Most importantly, they have
a limited capability to adapt to the evolving nature of data generated from
network traffic, a phenomenon known as concept drift. Hence, the algorithm
must dynamically update its internal parameters to counter the concept drift.
Existing literature relies on offline-trained models or
incremental learning models. The former suffers from partially or fully outdated
knowledge after drift occurrence, and the latter suffers from the constraints of
the pre-defined hyper-parameters of the model.
Thus, neural network-based semi-supervised stream data learning is
inadequate for capturing the changes in the distribution and characteristics of
various classes of data while avoiding the effect of outdated knowledge
stored in neural networks (NN). Therefore, we propose an approach that
integrates an NN, a meta-heuristic based on an evolutionary genetic
algorithm (GA), and a core online-offline clustering block (Core).
The system trains the NN on previously labelled data, and its knowledge is used
to calculate the core online-offline clustering block error. Genetic optimisation is
responsible for selecting the best parameters of the core clustering to minimise
the error.
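
The interaction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the trained NN is stood in for by a fixed list of reference labels, the clustering block by a nearest-centre rule with a single `radius` parameter, and the GA by a small elitist loop. All names and data are hypothetical.

```python
import random

def clustering_error(radius, points, reference_labels, centres):
    """Fraction of points whose cluster label disagrees with the reference.

    A point takes the index of its nearest centre if it lies within
    `radius` of that centre; otherwise it is marked as an outlier (-1).
    """
    errors = 0
    for x, ref in zip(points, reference_labels):
        nearest = min(range(len(centres)), key=lambda i: abs(x - centres[i]))
        label = nearest if abs(x - centres[nearest]) <= radius else -1
        errors += label != ref
    return errors / len(points)

def genetic_search(fitness, low, high, pop_size=20, generations=30, seed=0):
    """Minimise `fitness` over one real parameter with a simple elitist GA."""
    rng = random.Random(seed)
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]          # selection: keep best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2                        # arithmetic crossover
            child += rng.gauss(0, 0.1 * (high - low))  # Gaussian mutation
            children.append(min(high, max(low, child)))
        population = parents + children
    return min(population, key=fitness)

# Hypothetical 1-D data; the reference labels play the role of NN knowledge.
points = [0.0, 0.2, 0.4, 5.0, 5.3, 9.0]
centres = [0.0, 5.0]
reference_labels = [0, 0, 0, 1, 1, -1]

best = genetic_search(
    lambda r: clustering_error(r, points, reference_labels, centres),
    low=0.0,
    high=10.0,
)
print(best, clustering_error(best, points, reference_labels, centres))
```

In this toy setting any radius that covers the in-cluster points but not the outlier gives zero error; the real system would optimise several clustering parameters against the NN's predictions on stream data.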
In doing so, the old knowledge can be preserved dynamically to overcome the
concept drift. Nevertheless, the various components embedded in the
hyper-heuristic model raise concerns about the model's efficiency and whether
its performance is free of over-fitting and under-fitting. Therefore, the core
classifier in the hyper-heuristic Intrusion Detection System (IDS) is extended
to a parallel-structure NN. This gives more control over reaching optimal
learning without falling into sub-optimality because of over-fitting or under-fitting.
In addition, existing concept-drift-adaptive solutions are not feature-drift-aware:
they do not exploit the fact that many of the original features are irrelevant.
Here, memory consumption can be reduced by a feature selection algorithm that
excludes irrelevant features and preserves the relevant ones. The algorithm is
based on variable-length particle swarm optimisation (PSO). Variable-length
search is used because it is effective in high-dimensional spaces and
reduces the number of candidate features. This is done by segmenting the
space into parts after sorting the features based on their relevance.
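
The sort-and-segment step can be illustrated with a short sketch. It covers only the ranking and segmentation, not the PSO search itself. The relevance scores, the number of segments, and the idea that candidate subsets grow one segment at a time are assumptions made for illustration; the source does not specify them.

```python
def rank_features(relevance):
    """Return feature indices sorted from most to least relevant."""
    return sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)

def segment(ranked, n_segments):
    """Split the ranked feature list into contiguous, near-equal segments."""
    size, extra = divmod(len(ranked), n_segments)
    segments, start = [], 0
    for s in range(n_segments):
        end = start + size + (1 if s < extra else 0)
        segments.append(ranked[start:end])
        start = end
    return segments

def candidate_subsets(segments):
    """Build subsets of increasing length: the top 1, 2, ... segments."""
    subsets, current = [], []
    for seg in segments:
        current = current + seg
        subsets.append(current)
    return subsets

# Hypothetical relevance scores for six features (higher is more relevant).
relevance = [0.1, 0.9, 0.3, 0.7, 0.05, 0.5]

ranked = rank_features(relevance)
segments = segment(ranked, n_segments=3)
subsets = candidate_subsets(segments)
print(subsets)
```

A variable-length search could then evaluate short subsets first and only grow to longer ones when they improve the objective, which keeps the number of candidate features small.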
The algorithms were examined on two real datasets, namely, NSL-KDD and
Landsat. The experimental results showed that the accuracy of the algorithm
over the NSL-KDD dataset was 99.72%, with a memory reduction of 10%.
Furthermore, this was accomplished with only 25 neurons, a 75% reduction in
the number of neurons. Hence, the approach addresses the trade-off between
effectiveness and efficiency, which is a requirement in IoT networks. Moreover,
reducing memory use also helped to improve accuracy.
|