A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers

As the complexity and scale of the network environment increase continuously, various methods to detect attacks and intrusions from network traffic by classifying normal and abnormal network behaviors show their limitations. The number of network traffic signatures is increasing exponentially to the...

Bibliographic Details
Main Authors: Jaeik Cho, Seonghyeon Gong, Ken Choi
Format: Article
Language: English
Published: MDPI AG 2022-01-01
Series: Applied Sciences
Subjects: noise reduction, outlier detection, intrusion detection, machine learning for IDS
Online Access: https://www.mdpi.com/2076-3417/12/3/1011
_version_ 1797489426733989888
author Jaeik Cho
Seonghyeon Gong
Ken Choi
collection DOAJ
description As the complexity and scale of network environments continue to increase, methods that detect attacks and intrusions by classifying network behavior as normal or abnormal are showing their limitations. The number of network traffic signatures is increasing exponentially, to the extent that semi-real-time detection is not possible. Machine learning-based intrusion detection, however, provides only simple guidelines in the form of basic descriptions of security events. This is because security data for a specific environment cannot be configured, owing to data noise, diversification, and continuous changes in the system and network environment. Even when machine learning is trained and evaluated on a generalized dataset, similar performance can be expected only in that specific network environment. In this study, we propose a high-speed outlier detection method for network datasets that customizes the dataset in real time for a continuously changing network environment. The proposed method uses an ensemble-based noise filtering model that relies on the voting results of six classifiers (decision tree, random forest, support vector machine, naive Bayes, k-nearest neighbors, and logistic regression) to reflect the distribution and various environmental characteristics of the datasets. To evaluate the proposed method, we measured attack detection accuracy while gradually reducing the noise data in a time-series dataset. The experiments show that the proposed method keeps the training dataset at a size that permits semi-real-time learning, namely 10% of the total training dataset, while achieving the same level of accuracy as a detection model trained on the large training dataset. These results could serve as a basis for automatic tuning of network datasets and machine learning models for special-purpose environments and devices, such as ICS environments.
first_indexed 2024-03-10T00:17:23Z
format Article
id doaj.art-d499985d36314560aa9795ae0e0f7fbf
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T00:17:23Z
publishDate 2022-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-d499985d36314560aa9795ae0e0f7fbf
2023-11-23T15:50:19Z
eng
MDPI AG
Applied Sciences
2076-3417
2022-01-01
12
3
1011
10.3390/app12031011
A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers
Jaeik Cho (Illinois Institute of Technology, Chicago, IL 60616, USA)
Seonghyeon Gong (Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea)
Ken Choi (Illinois Institute of Technology, Chicago, IL 60616, USA)
https://www.mdpi.com/2076-3417/12/3/1011
noise reduction
outlier detection
intrusion detection
machine learning for IDS
title A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers
topic noise reduction
outlier detection
intrusion detection
machine learning for IDS
url https://www.mdpi.com/2076-3417/12/3/1011
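
The voting-based noise filtering mentioned in the description can be illustrated with a minimal sketch. The abstract does not state the exact voting rule, so this example assumes a common one: a sample is treated as noise when at least `min_disagree` classifiers predict a label different from the recorded one. The function name `vote_filter`, the `min_disagree` threshold, and the toy data are hypothetical and are not taken from the paper. In practice, each row of `predictions` might hold held-out predictions from one of the six classifier types named in the description.

```python
from typing import List, Sequence


def vote_filter(labels: Sequence[int],
                predictions: Sequence[Sequence[int]],
                min_disagree: int) -> List[int]:
    """Return the indices of samples kept after voting-based noise filtering.

    labels[i]         -- recorded label of sample i
    predictions[k][i] -- label predicted for sample i by classifier k
    min_disagree      -- hypothetical threshold: a sample is flagged as noise
                         when at least this many classifiers contradict its label
    """
    kept = []
    for i, label in enumerate(labels):
        # Count the classifiers whose prediction contradicts the recorded label.
        disagree = sum(1 for clf_preds in predictions if clf_preds[i] != label)
        if disagree < min_disagree:
            kept.append(i)
    return kept


# Toy data (invented for illustration): 5 samples, 6 classifiers.
labels = [0, 1, 0, 1, 0]
predictions = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
]

# Majority-style rule: drop a sample when 4 or more of the 6 classifiers disagree.
kept = vote_filter(labels, predictions, min_disagree=4)
print(kept)
```

With this toy data, only the third sample (index 2) is contradicted by a majority of the classifiers, so it is the one filtered out. Lowering `min_disagree` makes the filter more aggressive and shrinks the retained training set further, which is the trade-off between dataset size and accuracy that the description refers to.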