Infrequent Pattern Detection for Reliable Network Traffic Analysis Using Robust Evolutionary Computation

While anomaly detection is very important in many domains, such as in cybersecurity, there are many rare anomalies or infrequent patterns in cybersecurity datasets. Detection of infrequent patterns is computationally expensive. Cybersecurity datasets consist of many features, mostly irrelevant, resu...

Bibliographic Details
Main Authors: A. N. M. Bazlur Rashid, Mohiuddin Ahmed, Al-Sakib Khan Pathan
Format: Article
Language: English
Published: MDPI AG 2021-04-01
Series: Sensors
Subjects: infrequent; rare; pattern detection; network traffic; unsupervised; feature selection
Online Access: https://www.mdpi.com/1424-8220/21/9/3005
_version_ 1797536366962147328
author A. N. M. Bazlur Rashid
Mohiuddin Ahmed
Al-Sakib Khan Pathan
author_sort A. N. M. Bazlur Rashid
collection DOAJ
description Anomaly detection is important in many domains, including cybersecurity, yet cybersecurity datasets contain many rare anomalies, or infrequent patterns, whose detection is computationally expensive. These datasets also consist of many features, most of them irrelevant, which lowers the classification performance of machine learning algorithms. Hence, a feature selection (FS) approach, i.e., selecting only the relevant features, is an essential preprocessing step in cybersecurity data analysis. Although many FS approaches have been proposed in the literature, cooperative co-evolution (CC)-based FS approaches can be more suitable for cybersecurity data preprocessing in a Big Data scenario. Accordingly, in this paper, we apply our previously proposed CC-based FS with random feature grouping (CCFSRFG) to a benchmark cybersecurity dataset as the preprocessing step. Both the dataset with its original features and the dataset with a reduced number of features were used for infrequent pattern detection. The experiments were performed and evaluated using 10 unsupervised anomaly detection techniques; the proposed approach is therefore termed Unsupervised Infrequent Pattern Detection (UIPD). We then compared the results with and without FS in terms of true positive rate (TPR). The highest TPR improvement was obtained by cluster-based local outlier factor (CBLOF) for detection of the backdoor infrequent pattern: 385.91% when using FS. Furthermore, the highest overall infrequent pattern detection TPR improved by 61.47% across all infrequent patterns using clustering-based multivariate Gaussian outlier score (CMGOS) with FS.
first_indexed 2024-03-10T11:58:41Z
format Article
id doaj.art-2a5d60fe1cf84ce084d6985aba1f89ac
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-10T11:58:41Z
publishDate 2021-04-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-2a5d60fe1cf84ce084d6985aba1f89ac
2023-11-21T17:03:40Z
eng
MDPI AG
Sensors
1424-8220
2021-04-01
21
9
3005
10.3390/s21093005
Infrequent Pattern Detection for Reliable Network Traffic Analysis Using Robust Evolutionary Computation
A. N. M. Bazlur Rashid
Mohiuddin Ahmed
Al-Sakib Khan Pathan
School of Science, Edith Cowan University, Joondalup, WA 6027, Australia
School of Science, Edith Cowan University, Joondalup, WA 6027, Australia
Department of Computer Science and Engineering, Independent University, Dhaka 1229, Bangladesh
https://www.mdpi.com/1424-8220/21/9/3005
infrequent
rare
pattern detection
network traffic
unsupervised
feature selection
title Infrequent Pattern Detection for Reliable Network Traffic Analysis Using Robust Evolutionary Computation
title_sort infrequent pattern detection for reliable network traffic analysis using robust evolutionary computation
topic infrequent
rare
pattern detection
network traffic
unsupervised
feature selection
url https://www.mdpi.com/1424-8220/21/9/3005
work_keys_str_mv AT anmbazlurrashid infrequentpatterndetectionforreliablenetworktrafficanalysisusingrobustevolutionarycomputation
AT mohiuddinahmed infrequentpatterndetectionforreliablenetworktrafficanalysisusingrobustevolutionarycomputation
AT alsakibkhanpathan infrequentpatterndetectionforreliablenetworktrafficanalysisusingrobustevolutionarycomputation