Online Data Center Traffic Classification Based on Inter-Flow Correlations
Today, increasing attention is being paid to Data Center (DC) traffic classification since these infrastructures have become the heart of a variety of time-sensitive and data-intensive service platforms. Classification provides the required tools for better understanding traffic patterns in order to...
Main Authors: | Meriem Amina Si Saber, Mehdi Ghorbani, Abdolkhalegh Bayati, Kim-Khoa Nguyen, Mohamed Cheriet |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
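Subjects: | Data center; network traffic classification; interflow correlation; ensemble algorithms; random forest; data imbalance |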
Online Access: | https://ieeexplore.ieee.org/document/9047871/ |
_version_ | 1818854844848406528 |
---|---|
author | Meriem Amina Si Saber; Mehdi Ghorbani; Abdolkhalegh Bayati; Kim-Khoa Nguyen; Mohamed Cheriet |
collection | DOAJ |
description | Data Center (DC) traffic classification is receiving increasing attention because these infrastructures have become the heart of a variety of time-sensitive and data-intensive service platforms. Classification provides the tools needed to better understand traffic patterns in order to ensure high Quality of Service (QoS) performance and solve scalability problems. Unfortunately, existing classification algorithms cannot deal efficiently with two critical challenges in DC traffic: inter-class imbalance and strict time constraints. In this paper, we propose a novel correlation-based algorithm that follows a cost-sensitive approach combined with a Bagged Random Forest (BRF) ensemble algorithm, to address the inter-class imbalance problem while meeting time requirements in a data center environment. In this strategy, a new method based on Reverse k-Nearest Neighbors (RkNN) is proposed to compute rebalancing weights that express inter-flow correlations, enabling online classification. We demonstrate the efficiency of the algorithm by comparing its performance with several existing methods drawn from data-level, algorithm-level, and cost-sensitive strategies on four real-world datasets. The results reveal that the proposed algorithm outperforms most approaches on the different datasets in terms of precision, recall, F1 measure, AUC, and Kappa, whereas the other algorithms yield either high precision with low recall or low precision with high recall, causing congestion or resource over-provisioning. |
first_indexed | 2024-12-19T07:59:10Z |
format | Article |
id | doaj.art-b3c5840261eb450abb3aea5af3c91682 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T07:59:10Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-b3c5840261eb450abb3aea5af3c91682; 2022-12-21T20:29:55Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 60401; 60416; 10.1109/ACCESS.2020.2983605; 9047871; Online Data Center Traffic Classification Based on Inter-Flow Correlations; Meriem Amina Si Saber (https://orcid.org/0000-0002-3826-6758); Mehdi Ghorbani (https://orcid.org/0000-0003-4519-8463); Abdolkhalegh Bayati (https://orcid.org/0000-0001-7243-4594); Kim-Khoa Nguyen (https://orcid.org/0000-0002-9354-7544); Mohamed Cheriet (https://orcid.org/0000-0002-5246-7265); all authors: École de Technologie Supérieure (ÉTS), University of Quebec, Montreal, QC, Canada; https://ieeexplore.ieee.org/document/9047871/; Data center; network traffic classification; interflow correlation; ensemble algorithms; random forest; data imbalance |
title | Online Data Center Traffic Classification Based on Inter-Flow Correlations |
topic | Data center; network traffic classification; interflow correlation; ensemble algorithms; random forest; data imbalance |
url | https://ieeexplore.ieee.org/document/9047871/ |
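
The description in this record mentions rebalancing weights derived from Reverse k-Nearest Neighbors (RkNN). The record does not give the paper's formula, so the following is only a minimal sketch of the general RkNN idea. The weighting rule, the function names, the toy flow features, and the class labels are all assumptions made for illustration, not the authors' method. A point's reverse neighbors are the points that list it among their own k nearest neighbors.

```python
from collections import Counter
from math import dist


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    # Ties in distance are broken by index so the result is deterministic.
    others.sort(key=lambda j: (dist(points[i], points[j]), j))
    return others[:k]


def reverse_knn(points, k):
    """Map each index p to the set of indices q that have p among their k nearest."""
    reverse = {i: set() for i in range(len(points))}
    for q in range(len(points)):
        for p in k_nearest(points, q, k):
            reverse[p].add(q)
    return reverse


def rebalancing_weights(points, labels, k):
    """Hypothetical rule (an assumption, not taken from the paper):
    weight = (1 + number of same-class reverse neighbors) / class size."""
    class_size = Counter(labels)
    reverse = reverse_knn(points, k)
    weights = []
    for i, label in enumerate(labels):
        same_class = sum(1 for q in reverse[i] if labels[q] == label)
        weights.append((1 + same_class) / class_size[label])
    return weights


# Toy flows with two made-up numeric features each; labels are illustrative.
flows = [(0, 0), (1, 0), (0, 2), (3, 3), (10, 10), (10, 12)]
labels = ["majority", "majority", "majority", "majority", "minority", "minority"]
weights = rebalancing_weights(flows, labels, k=1)
print(weights)
```

Under this assumed rule, samples of the smaller class receive larger weights on average. Weights of this kind could then be passed to a cost-sensitive learner as per-sample weights, a step that is not shown here.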