A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers
As the complexity and scale of the network environment increase continuously, various methods to detect attacks and intrusions from network traffic by classifying normal and abnormal network behaviors show their limitations. The number of network traffic signatures is increasing exponentially to the...
Main Authors: | Jaeik Cho, Seonghyeon Gong, Ken Choi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-01-01 |
Series: | Applied Sciences |
Subjects: | noise reduction; outlier detection; intrusion detection; machine learning for IDS |
Online Access: | https://www.mdpi.com/2076-3417/12/3/1011 |
_version_ | 1797489426733989888 |
---|---|
author | Jaeik Cho; Seonghyeon Gong; Ken Choi |
author_facet | Jaeik Cho; Seonghyeon Gong; Ken Choi |
author_sort | Jaeik Cho |
collection | DOAJ |
description | As the complexity and scale of network environments increase continuously, methods that detect attacks and intrusions in network traffic by classifying normal and abnormal network behaviors show their limitations. The number of network traffic signatures is increasing exponentially, to the extent that semi-real-time detection is not possible. However, machine learning-based intrusion detection gives only simple guidelines about the contents of security events. This is why security data for a specific environment cannot be configured due to data noise, diversification, and continuous alteration of system and network environments. Although machine learning is performed and evaluated using a generalized dataset, its performance is expected to be similar only in that specific network environment. In this study, we propose a high-speed outlier detection method for network datasets that customizes the dataset in real time for a continuously changing network environment. The proposed method uses an ensemble-based noise data filtering model built on the voting results of six classifiers (decision tree, random forest, support vector machine, naive Bayes, k-nearest neighbors, and logistic regression) to reflect the distribution and various environmental characteristics of datasets. Moreover, to verify the performance of the proposed method, we measured the accuracy of attack detection while gradually reducing the noise data in the time series dataset. In the experiment, the proposed method maintains a training dataset of a size capable of semi-real-time learning, which is 10% of the total training dataset, and at the same time shows the same level of accuracy as a detection model using a large training dataset. These results could serve as the basis for automatic tuning of network datasets and machine learning that can be applied to special-purpose environments and devices such as ICS environments. |
first_indexed | 2024-03-10T00:17:23Z |
format | Article |
id | doaj.art-d499985d36314560aa9795ae0e0f7fbf |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T00:17:23Z |
publishDate | 2022-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-d499985d36314560aa9795ae0e0f7fbf; 2023-11-23T15:50:19Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-01-01; 12; 3; 1011; 10.3390/app12031011; A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers; Jaeik Cho (Illinois Institute of Technology, Chicago, IL 60616, USA); Seonghyeon Gong (Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea); Ken Choi (Illinois Institute of Technology, Chicago, IL 60616, USA); https://www.mdpi.com/2076-3417/12/3/1011; noise reduction; outlier detection; intrusion detection; machine learning for IDS |
spellingShingle | Jaeik Cho; Seonghyeon Gong; Ken Choi; A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers; Applied Sciences; noise reduction; outlier detection; intrusion detection; machine learning for IDS |
title | A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers |
title_sort | study on high speed outlier detection method of network abnormal behavior data using heterogeneous multiple classifiers |
topic | noise reduction; outlier detection; intrusion detection; machine learning for IDS |
url | https://www.mdpi.com/2076-3417/12/3/1011 |
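
The abstract describes filtering noise data from a training set using the voting results of six classifiers, but this record does not state the voting rule. The following is a minimal sketch of one plausible reading, not the authors' implementation: a sample is kept only when at least `min_agree` classifiers predict its recorded label. The function name `filter_noise_by_vote`, the `min_agree` threshold, and the toy predictions are all assumptions made for illustration; in practice the predictions would come from trained models, for example on held-out folds.

```python
from typing import Dict, List, Sequence


def filter_noise_by_vote(
    labels: Sequence[int],
    predictions: Dict[str, Sequence[int]],
    min_agree: int,
) -> List[int]:
    """Return indices of samples kept after vote-based noise filtering.

    A sample is kept when at least `min_agree` classifiers predict its
    recorded label; otherwise it is treated as noise and dropped.
    The threshold rule is an assumption, not taken from the paper.
    """
    for name, preds in predictions.items():
        if len(preds) != len(labels):
            raise ValueError(
                f"{name}: expected {len(labels)} predictions, got {len(preds)}"
            )
    kept = []
    for i, label in enumerate(labels):
        # Count how many classifiers agree with the recorded label.
        votes = sum(1 for preds in predictions.values() if preds[i] == label)
        if votes >= min_agree:
            kept.append(i)
    return kept


# Hypothetical predictions from the six classifier families named in the
# abstract (0 = normal, 1 = attack); the values are made up for the example.
labels = [0, 0, 1, 1, 0]
predictions = {
    "decision_tree":       [0, 1, 1, 0, 0],
    "random_forest":       [0, 1, 1, 0, 0],
    "svm":                 [0, 0, 1, 0, 0],
    "naive_bayes":         [0, 1, 1, 1, 1],
    "knn":                 [0, 1, 1, 0, 0],
    "logistic_regression": [0, 1, 1, 0, 0],
}

kept = filter_noise_by_vote(labels, predictions, min_agree=4)
print(kept)  # indices of samples retained for training
```

Raising `min_agree` shrinks the retained training set more aggressively, which is the trade-off the abstract alludes to when it reports training on 10% of the data; the threshold that achieves that ratio is not given in this record.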