A VPN-Encrypted Traffic Identification Method Based on Ensemble Learning
Traffic identification is a foundational means of optimizing network services and a key technique in network security. Data transmission encryption technologies have become widely used in recent years. Attackers often bypass the defenses of network security facilities through VP...
Main Authors: | Jie Cao, Xing-Liang Yuan, Ying Cui, Jia-Cheng Fan, Chin-Ling Chen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-06-01 |
Series: | Applied Sciences |
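Subjects: | VPN-encrypted traffic identification; ensemble learning; XGBoost; feature selection; Bayesian optimization |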
Online Access: | https://www.mdpi.com/2076-3417/12/13/6434 |
_version_ | 1797481005808877568 |
---|---|
author | Jie Cao; Xing-Liang Yuan; Ying Cui; Jia-Cheng Fan; Chin-Ling Chen |
author_facet | Jie Cao; Xing-Liang Yuan; Ying Cui; Jia-Cheng Fan; Chin-Ling Chen |
author_sort | Jie Cao |
collection | DOAJ |
description | Traffic identification is a foundational means of optimizing network services and a key technique in network security. Data transmission encryption technologies have become widely used in recent years. Attackers often bypass the defenses of network security facilities through VPNs to carry out network intrusions and malicious attacks, which poses a severe problem for existing encrypted traffic identification systems. Previous encrypted traffic identification methods suffer from feature redundancy, data class imbalance, and low identification rates. To address these three problems, this paper proposes a VPN-encrypted traffic identification method based on ensemble learning. First, to reduce feature redundancy in VPN-encrypted traffic features, a feature selection method based on mRMR (minimum redundancy maximum relevance) is proposed. Second, to address data class imbalance, the XGBoost identification model is improved by using the focal loss function. Finally, to improve the identification rate, a method based on Bayesian optimization is proposed for tuning the parameters of the ensemble learning model. Experiments showed that the proposed method produced more desirable VPN-encrypted traffic identification outcomes. In addition, in a comparison against eight common identification algorithms on two encrypted traffic datasets, the method appears to identify encrypted traffic more accurately. |
first_indexed | 2024-03-09T22:08:16Z |
format | Article |
id | doaj.art-399d4f7f77164c578f15cbc7d755f7d5 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T22:08:16Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-399d4f7f77164c578f15cbc7d755f7d5; 2023-11-23T19:36:47Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-06-01; 12; 13; 6434; 10.3390/app12136434; A VPN-Encrypted Traffic Identification Method Based on Ensemble Learning; Jie Cao (School of Computer Science, Northeast Electric Power University, Jilin 132012, China); Xing-Liang Yuan (School of Computer Science, Northeast Electric Power University, Jilin 132012, China); Ying Cui (Zhuhai Power Supply Bureau of Guangdong Power Grid Co., Ltd., Zhuhai 519000, China); Jia-Cheng Fan (School of Computer Science, Northeast Electric Power University, Jilin 132012, China); Chin-Ling Chen (School of Information Engineering, Changchun Sci-Tech University, Changchun 130600, China); https://www.mdpi.com/2076-3417/12/13/6434; VPN-encrypted traffic identification; ensemble learning; XGBoost; feature selection; Bayesian optimization |
title | A VPN-Encrypted Traffic Identification Method Based on Ensemble Learning |
title_sort | vpn encrypted traffic identification method based on ensemble learning |
topic | VPN-encrypted traffic identification; ensemble learning; XGBoost; feature selection; Bayesian optimization |
url | https://www.mdpi.com/2076-3417/12/13/6434 |
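
The description above says the XGBoost model is improved with the focal loss function to handle class imbalance. The record does not give the paper's exact formulation, so what follows is only a minimal sketch of the commonly used binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), evaluated on a raw score `z`. The function names, the defaults `gamma=2.0` and `alpha=0.25`, and the finite-difference derivatives are illustrative assumptions, not the authors' implementation. Gradient boosting libraries such as XGBoost accept custom objectives that return a per-sample gradient and Hessian with respect to the raw score, which is why both are computed here.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def focal_loss(z: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one sample with raw score z and label y in {0, 1}.

    gamma and alpha are illustrative defaults (assumptions), not values
    taken from the catalogued paper.
    """
    p = sigmoid(z)
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class weighting term
    p_t = max(p_t, 1e-12)                       # guard against log(0)
    # (1 - p_t) ** gamma down-weights samples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad_hess(z: float, y: int, gamma: float = 2.0,
                    alpha: float = 0.25, eps: float = 1e-4):
    """Approximate d(loss)/dz and d2(loss)/dz2 by central finite differences.

    A production objective would more likely use analytic derivatives;
    finite differences keep this sketch short and easy to check.
    """
    f0 = focal_loss(z, y, gamma, alpha)
    f_plus = focal_loss(z + eps, y, gamma, alpha)
    f_minus = focal_loss(z - eps, y, gamma, alpha)
    grad = (f_plus - f_minus) / (2.0 * eps)
    hess = (f_plus - 2.0 * f0 + f_minus) / (eps * eps)
    return grad, hess
```

With `gamma=0` the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, which gives a quick sanity check on the derivatives. With `gamma > 0`, a confidently correct sample contributes far less loss than a confidently wrong one, which is the property that helps when one traffic class dominates the data.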