A VPN-Encrypted Traffic Identification Method Based on Ensemble Learning

One of the foundational and key means of optimizing network service in the field of network security is traffic identification. Various data transmission encryption technologies have been widely employed in recent years. Wrongdoers usually bypass the defense of network security facilities through VP...

Bibliographic Details
Main Authors: Jie Cao, Xing-Liang Yuan, Ying Cui, Jia-Cheng Fan, Chin-Ling Chen
Format: Article
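Language: English
Published: MDPI AG 2022-06-01
Series: Applied Sciences
Subjects: VPN-encrypted traffic identification; ensemble learning; XGBoost; feature selection; Bayesian optimization
Online Access: https://www.mdpi.com/2076-3417/12/13/6434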
collection DOAJ
description Traffic identification is a foundational means of optimizing network services and a key technique in network security. Data transmission encryption technologies have been widely adopted in recent years, and attackers often use VPNs to bypass network security defenses and carry out intrusions and malicious attacks. This poses a severe problem for existing encrypted traffic identification systems. Previous encrypted traffic identification methods suffer from feature redundancy, data class imbalance, and low identification rates. To address these three problems, this paper proposes a VPN-encrypted traffic identification method based on ensemble learning. First, to address feature redundancy in VPN-encrypted traffic features, a feature selection method based on minimum redundancy maximum relevance (mRMR) is proposed. Second, to address data class imbalance, the XGBoost identification model is improved by using the focal loss function. Finally, to improve the identification rate, a method for optimizing the parameters of the ensemble learning model based on Bayesian optimization is proposed. Experiments showed that the proposed method produced better VPN-encrypted traffic identification results. In a comparison with eight common identification algorithms on two encrypted traffic datasets, the method appears to be more accurate in identifying encrypted traffic.
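The focal loss mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the default values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, and the record does not state which settings the paper uses. The sketch shows the binary focal loss on a raw score `z` and its first derivative, which is the kind of gradient a boosting model's custom objective would need (a second derivative would also be required in practice).

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _pt_and_weight(z, y, alpha):
    # p_t is the predicted probability of the true class;
    # a_t is the class weight (alpha for y=1, 1-alpha for y=0).
    p = sigmoid(z)
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return max(p_t, 1e-12), a_t  # clamp to avoid log(0)


def focal_loss(z, y, gamma=2.0, alpha=0.25):
    # FL = -a_t * (1 - p_t)^gamma * log(p_t)
    # gamma and alpha defaults are illustrative assumptions.
    p_t, a_t = _pt_and_weight(z, y, alpha)
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad(z, y, gamma=2.0, alpha=0.25):
    # Derivative of the focal loss with respect to the raw score z.
    p_t, a_t = _pt_and_weight(z, y, alpha)
    sign = 1.0 if y == 1 else -1.0  # dp_t/dz changes sign with the label
    inner = (gamma * (1.0 - p_t) ** gamma * p_t * math.log(p_t)
             - (1.0 - p_t) ** (gamma + 1.0))
    return sign * a_t * inner


if __name__ == "__main__":
    # A confidently correct sample contributes far less loss than a
    # confidently wrong one, which is how minority-class errors gain weight.
    print(focal_loss(4.0, 1), focal_loss(-4.0, 1))
```

With `gamma=0.0` and `alpha=1.0` the loss for a positive sample reduces to ordinary cross-entropy, so the extra factor `(1 - p_t)^gamma` is what down-weights easy, well-classified samples.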
first_indexed 2024-03-09T22:08:16Z
format Article
id doaj.art-399d4f7f77164c578f15cbc7d755f7d5
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T22:08:16Z
topic VPN-encrypted traffic identification
ensemble learning
XGBoost
feature selection
Bayesian optimization