Efficient DNN Execution on Intermittently-Powered IoT Devices With Depth-First Inference
Program execution on intermittently powered Internet-of-Things (IoT) devices must ensure forward progress in the presence of frequent power failures. A general solution is intermittent computing, by which the program states are frequently checkpointed to non-volatile memory (NVM) so that once a powe...
Main Authors: | Mingsong Lv, Enyu Xu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
Subjects: | Intermittent computing; DNN inference; IoT devices; embedded systems |
Online Access: | https://ieeexplore.ieee.org/document/9874742/ |
_version_ | 1811205415169425408 |
---|---|
author | Mingsong Lv; Enyu Xu |
author_sort | Mingsong Lv |
collection | DOAJ |
description | Program execution on intermittently powered Internet-of-Things (IoT) devices must ensure forward progress in the presence of frequent power failures. A general solution is intermittent computing, in which program state is frequently checkpointed to non-volatile memory (NVM) so that, after a power failure, the program can resume from the latest checkpoint once the system has regained energy. However, executing a deep neural network (DNN) inference program intermittently poses a significant problem: during execution, the program generates large feature maps (as part of the program state), and checkpointing them to NVM incurs significant time and energy overhead, which reduces inference efficiency. This paper proposes an approach to reduce the amount of feature-map data written to NVM in intermittent DNN inference. The main idea is to partition the inference task into several slices and execute each slice in a depth-first manner, so that the intermediate feature maps produced within a slice do not need to be written to NVM. Extensive experiments show that the proposed approach significantly reduces the amount of NVM writing and achieves a maximum speedup of 1.965× in total inference time compared to the state-of-the-art approach. |
first_indexed | 2024-04-12T03:32:17Z |
format | Article |
id | doaj.art-5cc6c69ef5d74ec8b09c0d5433ce15cd |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-12T03:32:17Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-5cc6c69ef5d74ec8b09c0d5433ce15cd; 2022-12-22T03:49:32Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 101999; 102008; 10.1109/ACCESS.2022.3203719; 9874742; Efficient DNN Execution on Intermittently-Powered IoT Devices With Depth-First Inference; Mingsong Lv (0; https://orcid.org/0000-0002-4489-745X); Enyu Xu (1); International Laboratory for Smart Systems, Northeastern University, Shenyang, China; International Laboratory for Smart Systems, Northeastern University, Shenyang, China; https://ieeexplore.ieee.org/document/9874742/; Intermittent computing; DNN inference; IoT devices; embedded systems |
title | Efficient DNN Execution on Intermittently-Powered IoT Devices With Depth-First Inference |
topic | Intermittent computing; DNN inference; IoT devices; embedded systems |
url | https://ieeexplore.ieee.org/document/9874742/ |
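
The depth-first idea summarized in the description can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the network is assumed to be a chain of 1-D convolutions without padding, NVM cost is approximated by counting written feature-map elements, and the names `conv1d`, `layer_wise`, and `depth_first` are hypothetical. The record does not say how the paper chooses its slices, so the even split over the final output used here is an assumption as well.

```python
def conv1d(x, kernel):
    """Valid (no padding, stride 1) 1-D convolution on Python lists."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def layer_wise(x, kernels):
    """Baseline: run layer by layer and checkpoint every full feature map."""
    nvm_writes = 0
    for kernel in kernels:
        x = conv1d(x, kernel)
        nvm_writes += len(x)  # assumption: the whole feature map goes to NVM
    return x, nvm_writes


def depth_first(x, kernels, num_slices):
    """Sliced execution: each slice runs through all layers in volatile
    memory, and only its final output is counted as an NVM write."""
    halo = sum(len(k) - 1 for k in kernels)   # extra input needed per slice
    out_len = len(x) - halo                   # length of the final output
    slice_len = -(-out_len // num_slices)     # ceiling division
    output, nvm_writes = [], 0
    for start in range(0, out_len, slice_len):
        stop = min(start + slice_len, out_len)
        tile = x[start:stop + halo]           # input region this slice needs
        for kernel in kernels:
            tile = conv1d(tile, kernel)       # intermediates stay volatile
        output.extend(tile)
        nvm_writes += len(tile)               # only the slice result is saved
    return output, nvm_writes


signal = list(range(32))
kernels = [[1, 0, -1], [1, 2, 1], [1, 1, 1]]
reference, baseline_writes = layer_wise(signal, kernels)
sliced, sliced_writes = depth_first(signal, kernels, num_slices=4)
print(reference == sliced, baseline_writes, sliced_writes)
```

In this toy setting both schedules produce the same output, while the sliced schedule writes only the final output elements. The sketch ignores costs a real system would have to account for, such as recomputing overlapping regions between neighboring slices and reading inputs and weights from NVM.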