Explainable Learning-Based Timeout Optimization for Accurate and Efficient Elephant Flow Prediction in SDNs

Bibliographic Details
Main Authors: Ling Xia Liao, Changqing Zhao, Roy Xiaorong Lai, Han-Chieh Chao
Format: Article
Language: English
Published: MDPI AG 2024-02-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/24/3/963
Description
Summary: Accurately and efficiently predicting elephant flows (elephants) is crucial for optimizing network performance and resource utilization. Current prediction approaches for software-defined networks (SDNs) typically rely on complete traffic and statistics moving from switches to controllers, which consumes extra control-channel bandwidth and adds network delay. To address this issue, this paper proposes a prediction strategy based on incomplete traffic that is sampled by the timeouts for the installation or reactivation of flow entries. The strategy assigns a very short hard timeout ($T_{initial}$) to flow entries and then increases it at a rate of $r$ until flows are identified as elephants or reach the end of their lifespans. Predicted elephants are switched to an idle timeout of 5 s. Logistic regression is used to model elephants based on a complete dataset. Bayesian optimization is then used to tune the trained model, $T_{initial}$, and $r$ over the incomplete dataset. The process of feature selection, model learning, and optimization is explained. An extensive evaluation shows that the proposed approach can achieve over 90% generalization accuracy over seven different datasets, including campus, backbone, and Internet of Things (IoT) networks. Elephants can be correctly predicted for about half of their lifetime. The proposed approach can significantly reduce the controller–switch interaction in campus and IoT networks, although packet completion approaches may need to be applied in networks with a short mean packet inter-arrival time.
ISSN: 1424-8220
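
The timeout strategy described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the "rate of r" is read as a multiplicative factor applied after each expiry, the names `sample_flow`, `Outcome`, and `byte_threshold` are hypothetical, and the byte-count rule is only a stand-in for the logistic regression model mentioned in the summary. Only the 5 s idle timeout comes from the record itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

IDLE_TIMEOUT_S = 5.0  # from the summary: predicted elephants get a 5 s idle timeout


@dataclass
class Outcome:
    is_elephant: bool
    decided_at: Optional[float]  # expiry time of the window that triggered the prediction
    installs: int                # flow-entry installations, i.e. controller round trips
    timeouts: List[float] = field(default_factory=list)  # hard timeouts used, in order


def sample_flow(
    packets: List[Tuple[float, int]],
    t_initial: float,
    r: float,
    looks_like_elephant: Callable[[int, int, float], bool],
) -> Outcome:
    """Replay one flow as (timestamp_s, size_bytes) pairs sorted by time.

    Each packet that finds no flow entry triggers an installation with a hard
    timeout. When the entry expires, the counters of that window are checked.
    Assumption: the next hard timeout is the previous one multiplied by r.
    """
    if t_initial <= 0 or r <= 0:
        raise ValueError("t_initial and r must be positive")
    timeout, installs, used, i = t_initial, 0, [], 0
    while i < len(packets):
        start = packets[i][0]          # table miss: install or reactivate the entry
        expiry = start + timeout
        installs += 1
        used.append(timeout)
        byte_count = packet_count = 0
        while i < len(packets) and packets[i][0] < expiry:
            byte_count += packets[i][1]
            packet_count += 1
            i += 1
        # Entry expired: the controller sees only this window's counters.
        if looks_like_elephant(byte_count, packet_count, timeout):
            # From here on the entry would use IDLE_TIMEOUT_S instead.
            return Outcome(True, expiry, installs, used)
        timeout *= r
    return Outcome(False, None, installs, used)  # flow ended before being flagged


def byte_threshold(byte_count: int, packet_count: int, window_s: float) -> bool:
    # Hypothetical stand-in for the trained classifier.
    return byte_count >= 8000


if __name__ == "__main__":
    heavy = [(k * 0.25, 1000) for k in range(40)]  # 1000-byte packet every 0.25 s
    print(sample_flow(heavy, t_initial=0.5, r=2.0, looks_like_elephant=byte_threshold))
```

In this reading, a small initial timeout makes the first samples cheap for short flows, while a larger r reduces the number of reinstallations for long ones; tuning that trade-off is what the summary attributes to Bayesian optimization.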