An Efficient Network Classification Based on Various-Widths Clustering and Semi-Supervised Stacking
Network traffic classification is a basic tool that internet service providers and various government and private organisations use to investigate network activities, in support of Intrusion Detection Systems (IDS), security monitoring, lawful interception and Quality of Service (QoS). Recent network...
Main Authors: | Abdulmohsen Almalawi, Adil Fahad |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | Internet traffic classification, semi-supervised learning, multiview |
Online Access: | https://ieeexplore.ieee.org/document/9591648/ |
_version_ | 1818856395099865088 |
---|---|
author | Abdulmohsen Almalawi; Adil Fahad |
author_facet | Abdulmohsen Almalawi; Adil Fahad |
author_sort | Abdulmohsen Almalawi |
collection | DOAJ |
description | Network traffic classification is a basic tool that internet service providers and various government and private organisations use to investigate network activities, in support of Intrusion Detection Systems (IDS), security monitoring, lawful interception and Quality of Service (QoS). Recent network traffic classification approaches have used extracted, predefined class labels supplied by multiple experts to build a robust network traffic classifier. However, keeping IP traffic classifiers up to date requires large amounts of newly emerging labeled traffic flows, which are often expensive and time-consuming to obtain. This paper proposes an efficient network classification approach (named Net-Stack) which inherits the advantages of various-widths clustering and semi-supervised stacking to mitigate the shortage of labeled flows and to accurately learn IP traffic features and knowledge. The Net-Stack approach consists of four stages. The first stage pre-processes the traffic data and removes noisy traffic observations using various-widths clustering, selecting the most representative observations from both the local and global perspectives. The second stage generates multiview representations of the original data with strong discrimination ability using dimensionality reduction techniques. The third stage applies heterogeneous semi-supervised learning algorithms that exploit the complementary information contained in the multiple views to refine the decision boundaries for each traffic class and to obtain a low-dimensional metadata representation. The final stage employs a meta-classifier and a stacking approach to learn comprehensively from the metadata representation obtained in stage three, improving generalization performance and predicting the final classification decision. An experimental study on twelve traffic data sets shows the effectiveness of the proposed Net-Stack approach compared with the baseline methods when relatively little labeled training data is available. |
first_indexed | 2024-12-19T08:23:49Z |
format | Article |
id | doaj.art-5254fb8113e145e7ba37f8d22cfb04b3 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T08:23:49Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-5254fb8113e145e7ba37f8d22cfb04b3; 2022-12-21T20:29:20Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 151681-151696; DOI 10.1109/ACCESS.2021.3123451; document 9591648; An Efficient Network Classification Based on Various-Widths Clustering and Semi-Supervised Stacking; Abdulmohsen Almalawi (https://orcid.org/0000-0002-4389-1339), School of Computer Science & Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia; Adil Fahad (https://orcid.org/0000-0003-3728-7482), Department of Computer Science, College of Computer Science & Information Technology, Al Baha University, Al Baha, Saudi Arabia; https://ieeexplore.ieee.org/document/9591648/; Internet traffic classification; semi-supervised learning; multiview |
title | An Efficient Network Classification Based on Various-Widths Clustering and Semi-Supervised Stacking |
topic | Internet traffic classification; semi-supervised learning; multiview |
url | https://ieeexplore.ieee.org/document/9591648/ |
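
The first stage described in the abstract removes noisy traffic observations by clustering. The sketch below illustrates that general idea only. It is a simplified stand-in that uses a single fixed cluster width rather than the paper's various-widths clustering, and the function names, the `width` and `min_size` parameters, and the toy flow features are all hypothetical, not taken from the article.

```python
import math


def fixed_width_clusters(points, width):
    """Group point indices with a single fixed width (simplified stand-in).

    Each point joins the first cluster whose centre lies within `width`;
    otherwise it becomes the centre of a new cluster.
    """
    centres = []   # one representative point per cluster
    members = []   # list of point indices per cluster
    for idx, point in enumerate(points):
        for c, centre in enumerate(centres):
            if math.dist(point, centre) <= width:
                members[c].append(idx)
                break
        else:
            centres.append(point)
            members.append([idx])
    return members


def drop_sparse_clusters(points, width, min_size):
    """Treat clusters with fewer than `min_size` members as noise and drop them."""
    kept = []
    for cluster in fixed_width_clusters(points, width):
        if len(cluster) >= min_size:
            kept.extend(cluster)
    return [points[i] for i in sorted(kept)]


# Toy two-feature "flows" (hypothetical values); the last one is isolated.
flows = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
clean = drop_sparse_clusters(flows, width=0.5, min_size=2)
```

Here the isolated observation forms a cluster of size one and is discarded, while the two dense groups are kept. A single width is sensitive to differences in local density, which is the limitation that clustering with several widths is meant to address.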