Split computing: DNN inference partition with load balancing in IoT-edge platform for beyond 5G

Bibliographic Details
Main Authors: Jyotirmoy Karjee, Praveen Naik S, Kartik Anand, Vanamala N. Bhargav
Format: Article
Language: English
Published: Elsevier 2022-10-01
Series: Measurement: Sensors
Online Access: http://www.sciencedirect.com/science/article/pii/S2665917422000435
Description
Summary: In the era of beyond-5G technology, a growing number of applications are expected to use deep neural network (DNN) models, and low inference time is essential to a good user experience in Internet of Things (IoT) use cases. However, hardware limitations and low computational capability make it almost impossible to execute complex DNN models on IoT devices. To mitigate these problems, we introduce split computing, in which the DNN inference task is partially offloaded from an IoT device to a nearby dependable device called the edge. When splitting a DNN between IoT and edge, it is unclear where the model should be partitioned under varying network conditions to ensure a satisfactory user experience. To address this, we propose two mechanisms that trade off computation against communication: (1) a dynamic split computation (DSC) mechanism, which finds an optimal partitioning of the DNN model between IoT and edge to reduce the computation overhead (i.e., the overall inference time); and (2) a reliable communication network switching (RCNS) mechanism, which intelligently selects a suitable network (Wi-Fi, cellular, or Bluetooth) to provide reliable data transport between IoT and edge. Further, because edge devices perform DNN tasks extensively, they may run out of battery or be interrupted by high-priority user applications running on them, and so fail to complete the assigned DNN task in the desired time. To address this, we also propose a load-balancing mechanism between IoT and edge, the task load balancing with prioritization (TLBP) model, for single and multiple DNN task scenarios. We conduct extensive experiments with a Raspberry Pi as the IoT device and Samsung Galaxy S20 smartphones as edge devices to test the proposed mechanisms. The results show that the proposed algorithms substantially reduce the inference time of DNN models compared with on-device inference and balance tasks across the edge devices to minimize excessive battery drain.
ISSN: 2665-9174
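
The trade-off between computation and communication described in the summary can be illustrated with a minimal sketch. This is not the paper's DSC or RCNS algorithm, which this record does not specify. It is a brute-force search under stated assumptions: per-layer latencies on the IoT device and on the edge are known in advance, the only communication cost is sending one intermediate tensor (or the raw input) over a link with a fixed round-trip time and bandwidth, and the cost of returning the result is ignored. The names `Link`, `transfer_ms`, `best_split`, and `best_plan` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Link:
    """Hypothetical network description (values would come from measurement)."""
    name: str
    bandwidth_mbps: float  # assumed sustained throughput
    rtt_ms: float          # assumed fixed round-trip overhead


def transfer_ms(size_kb: float, link: Link) -> float:
    # 1 Mbps equals 1 kilobit per millisecond, so kilobits / Mbps gives ms.
    return link.rtt_ms + (size_kb * 8.0) / link.bandwidth_mbps


def best_split(iot_ms: List[float], edge_ms: List[float],
               out_kb: List[float], input_kb: float,
               link: Link) -> Tuple[int, float]:
    """Return (k, total_ms): layers [0, k) run on the IoT device,
    layers [k, n) run on the edge. k == n means fully on-device."""
    n = len(iot_ms)
    best_k, best_total = n, float("inf")
    for k in range(n + 1):
        local = sum(iot_ms[:k])
        remote = sum(edge_ms[k:])
        if k == n:
            comm = 0.0  # nothing is offloaded, so nothing is sent
        else:
            # Send the raw input if splitting before layer 0,
            # otherwise the output of the last locally executed layer.
            payload = input_kb if k == 0 else out_kb[k - 1]
            comm = transfer_ms(payload, link)
        total = local + comm + remote
        if total < best_total:
            best_k, best_total = k, total
    return best_k, best_total


def best_plan(iot_ms: List[float], edge_ms: List[float],
              out_kb: List[float], input_kb: float,
              links: List[Link]) -> Tuple[str, int, float]:
    """Try every candidate link and keep the fastest (link, split) pair."""
    plans = [(link.name, *best_split(iot_ms, edge_ms, out_kb, input_kb, link))
             for link in links]
    return min(plans, key=lambda p: p[2])
```

Called with a list of candidate links, `best_plan` returns the link name, the split index, and the estimated end-to-end time. In this model a fast link tends to push the split toward the edge, while a slow link can make fully on-device execution the cheapest option, which is the kind of network-dependent behaviour the summary describes.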