Blackthorn: Latency Estimation Framework for CNNs on Embedded Nvidia Platforms

With more powerful yet efficient embedded devices and accelerators becoming available for Deep Neural Networks (DNNs), machine learning is becoming an integral part of edge computing. As the number of such devices increases, finding the best platform for a specific application has become more challenging. A common question for application developers is how to find the most cost-effective combination of a DNN and a device while still meeting latency and accuracy requirements. In this work, we propose Blackthorn, a layer-wise latency estimation framework for embedded Nvidia GPUs based on analytical models. We provide accurate predictions for each layer, helping developers to find bottlenecks and optimize the architecture of a DNN to fit target platforms. Our framework can quickly evaluate and compare large numbers of network optimizations without needing to build time-consuming execution engines. Our experimental results on Jetson TX2 and Jetson Nano devices show per-layer estimation errors of 6.104% and 5.888% Root-Mean-Square-Percentage-Error (RMSPE), respectively, significantly outperforming current state-of-the-art methods. At the network level, the average latency error is below 3% for the tested DNNs.
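To make the idea concrete, below is a minimal, illustrative sketch of layer-wise latency estimation and the RMSPE metric mentioned in the abstract. The analytical model form (a roofline-style maximum of compute and memory time plus a fixed launch overhead), the device parameters (peak MAC throughput, memory bandwidth, per-layer overhead), and the example measurements are assumptions for illustration only, not Blackthorn's published model or measured data.

```python
# Illustrative sketch: per-layer analytical latency estimates summed to a
# network-level estimate, evaluated with RMSPE. All model parameters below
# are hypothetical placeholders, not values from the paper.
from dataclasses import dataclass
from math import sqrt

@dataclass
class ConvLayer:
    # Output spatial size, input/output channels, and square kernel size.
    h: int
    w: int
    c_in: int
    c_out: int
    k: int

def estimate_conv_latency_ms(layer: ConvLayer,
                             peak_macs_per_s: float = 600e9,    # assumed device peak (MAC/s)
                             mem_bw_bytes_per_s: float = 25e9,  # assumed memory bandwidth
                             overhead_ms: float = 0.05) -> float:  # assumed launch overhead
    """Roofline-style estimate: the slower of compute and memory time, plus a fixed overhead."""
    macs = layer.h * layer.w * layer.c_in * layer.c_out * layer.k * layer.k
    # fp16 traffic: input + output feature maps and the weight tensor (2 bytes per element).
    bytes_moved = 2 * (layer.h * layer.w * (layer.c_in + layer.c_out)
                       + layer.k * layer.k * layer.c_in * layer.c_out)
    compute_ms = macs / peak_macs_per_s * 1e3
    memory_ms = bytes_moved / mem_bw_bytes_per_s * 1e3
    return max(compute_ms, memory_ms) + overhead_ms

def network_latency_ms(layers) -> float:
    """Network-level estimate as the sum of per-layer estimates."""
    return sum(estimate_conv_latency_ms(l) for l in layers)

def rmspe(measured, predicted) -> float:
    """Root-Mean-Square-Percentage-Error in percent, the metric reported in the abstract."""
    terms = [((p - m) / m) ** 2 for m, p in zip(measured, predicted)]
    return 100.0 * sqrt(sum(terms) / len(terms))

if __name__ == "__main__":
    layers = [ConvLayer(56, 56, 64, 64, 3), ConvLayer(28, 28, 128, 128, 3)]
    preds = [estimate_conv_latency_ms(l) for l in layers]
    print("per-layer estimates (ms):", preds)
    print("network estimate (ms):", network_latency_ms(layers))
    # Hypothetical measured latencies, only to show how RMSPE is computed.
    print("RMSPE (%):", rmspe([0.30, 0.45], preds))
```

The split into a per-layer estimator, a network-level sum, and a separate error metric mirrors the evaluation setup described in the abstract; the actual Blackthorn models are more detailed and calibrated per device.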


Bibliographic Details
Main Authors: Martin Lechner, Axel Jantsch
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Subjects: Artificial neural networks; estimation; neural network hardware
Online Access: https://ieeexplore.ieee.org/document/9503415/
author Martin Lechner
Axel Jantsch
collection DOAJ
description With more powerful yet efficient embedded devices and accelerators becoming available for Deep Neural Networks (DNNs), machine learning is becoming an integral part of edge computing. As the number of such devices increases, finding the best platform for a specific application has become more challenging. A common question for application developers is how to find the most cost-effective combination of a DNN and a device while still meeting latency and accuracy requirements. In this work, we propose Blackthorn, a layer-wise latency estimation framework for embedded Nvidia GPUs based on analytical models. We provide accurate predictions for each layer, helping developers to find bottlenecks and optimize the architecture of a DNN to fit target platforms. Our framework can quickly evaluate and compare large numbers of network optimizations without needing to build time-consuming execution engines. Our experimental results on Jetson TX2 and Jetson Nano devices show per-layer estimation errors of 6.104% and 5.888% Root-Mean-Square-Percentage-Error (RMSPE), respectively, significantly outperforming current state-of-the-art methods. At the network level, the average latency error is below 3% for the tested DNNs.
format Article
id doaj.art-df5e7bce502d4486b9566620b01c45b2
institution Directory Open Access Journal
issn 2169-3536
language English
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Martin Lechner (ORCID: 0000-0003-1083-0246) and Axel Jantsch (ORCID: 0000-0003-2251-0004), Institute of Computer Technology, TU Wien, Vienna, Austria, "Blackthorn: Latency Estimation Framework for CNNs on Embedded Nvidia Platforms," IEEE Access, vol. 9, pp. 110074-110084, 2021, doi: 10.1109/ACCESS.2021.3101936. Online: https://ieeexplore.ieee.org/document/9503415/. Keywords: Artificial neural networks; estimation; neural network hardware.
title Blackthorn: Latency Estimation Framework for CNNs on Embedded Nvidia Platforms
topic Artificial neural networks
estimation
neural network hardware
url https://ieeexplore.ieee.org/document/9503415/