A Comparative Study of Machine Learning-based Approach for Network Traffic Classification

Internet usage has grown rapidly and become an essential part of daily life, in step with the rapid development of network infrastructure in recent years. Protecting users’ confidential information on the global network has therefore become one of the most significant concerns. Even...

Bibliographic Details
Main Authors: Kien Trang, An Hoang Nguyen
Format: Article
Language: English
Published: Universitas Negeri Malang 2022-01-01
Series: Knowledge Engineering and Data Science
Online Access: http://journal2.um.ac.id/index.php/keds/article/view/25393
_version_ 1798033934072676352
author Kien Trang
An Hoang Nguyen
author_facet Kien Trang
An Hoang Nguyen
author_sort Kien Trang
collection DOAJ
description Internet usage has grown rapidly and become an essential part of daily life, in step with the rapid development of network infrastructure in recent years. Protecting users’ confidential information on the global network has therefore become one of the most significant concerns. Although multiple encryption algorithms and techniques have been adopted by different parties, including internet providers and web hosting services, encryption also allows attackers to target network systems anonymously. Classifying network data streams to improve network quality and security is therefore attracting increasing research interest. This work introduces a machine learning-based approach to find the most suitable model for network traffic classification. Data pre-processing is first applied to normalize each feature type in the dataset. In the classification phase, different machine learning techniques, including k-Nearest Neighbors (KNN), Artificial Neural Network (ANN), and Random Forest (RF), are applied to the normalized features. The open-access ISCXVPN2016 dataset is used, which covers two types of encryption (VPN and Non-VPN) and seven traffic categories. Experimental results on this dataset show that the proposed models reach a high classification rate, over 85% in some cases, with the RF model obtaining the best results among the three techniques.
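The normalize-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes min-max scaling as the normalization step (the abstract does not name one), shows only the k-Nearest Neighbors classifier in plain Python, and uses a hypothetical toy set of two flow features in place of ISCXVPN2016.

```python
from collections import Counter
from math import dist


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0.

    Returns the scaled rows and a function that applies the same
    scaling to a new sample.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]

    def scale(row):
        return [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]

    return [scale(r) for r in rows], scale


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical toy flows: (mean packet size in bytes, flow duration in seconds).
# These values are invented for illustration and are not taken from ISCXVPN2016.
flows = [
    (1200.0, 30.0), (1100.0, 45.0), (1300.0, 40.0),
    (200.0, 2.0), (150.0, 1.0), (250.0, 3.0),
]
labels = ["VPN", "VPN", "VPN", "Non-VPN", "Non-VPN", "Non-VPN"]

train_x, scale = min_max_normalize(flows)
prediction = knn_predict(train_x, labels, scale((1150.0, 35.0)), k=3)
print(prediction)
```

Normalizing before the distance computation matters here because packet size and duration are on very different scales; without it, the larger-valued feature would dominate the Euclidean distance.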
first_indexed 2024-04-11T20:37:17Z
format Article
id doaj.art-b56bae9035034fb798131dad8c07198d
institution Directory Open Access Journal
issn 2597-4602
2597-4637
language English
last_indexed 2024-04-11T20:37:17Z
publishDate 2022-01-01
publisher Universitas Negeri Malang
record_format Article
series Knowledge Engineering and Data Science
spelling doaj.art-b56bae9035034fb798131dad8c07198d
2022-12-22T04:04:20Z
eng
Universitas Negeri Malang
Knowledge Engineering and Data Science
2597-4602
2597-4637
2022-01-01
4
2
128
137
10.17977/um018v4i22021p128-137
8221
A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
Kien Trang (0)
An Hoang Nguyen (1)
(0) School of Electrical Engineering, International University, Ho Chi Minh City, Vietnam
(1) School of Electrical Engineering, International University, Ho Chi Minh City, Vietnam
http://journal2.um.ac.id/index.php/keds/article/view/25393
spellingShingle Kien Trang
An Hoang Nguyen
A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
Knowledge Engineering and Data Science
title A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
title_full A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
title_fullStr A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
title_full_unstemmed A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
title_short A Comparative Study of Machine Learning-based Approach for Network Traffic Classification
title_sort comparative study of machine learning based approach for network traffic classification
url http://journal2.um.ac.id/index.php/keds/article/view/25393
work_keys_str_mv AT kientrang acomparativestudyofmachinelearningbasedapproachfornetworktrafficclassification
AT anhoangnguyen acomparativestudyofmachinelearningbasedapproachfornetworktrafficclassification
AT kientrang comparativestudyofmachinelearningbasedapproachfornetworktrafficclassification
AT anhoangnguyen comparativestudyofmachinelearningbasedapproachfornetworktrafficclassification