Latency Estimation Tool and Investigation of Neural Networks Inference on Mobile GPU
Many deep learning applications need to run on mobile devices, and for many of them both accuracy and inference time matter. While the number of FLOPs is commonly used as a proxy for neural network latency, it may not be the best choice. In order to obtain a better approximation of...
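The abstract's point that FLOPs are only a rough latency proxy can be illustrated with a minimal sketch (a hypothetical example, not the article's tool): two convolution layers can have identical analytic FLOP counts yet very different on-device latency.

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out):
    """Analytic FLOP count for one Conv2d layer (multiply + add per MAC)."""
    k_h, k_w = kernel
    return 2 * c_in * k_h * k_w * c_out * h_out * w_out

# A 3x3 conv and a wider 1x1 conv with the same FLOP budget:
flops_3x3 = conv2d_flops(64, 64, (3, 3), 56, 56)
flops_1x1 = conv2d_flops(64, 576, (1, 1), 56, 56)

# Equal FLOPs, yet measured latency on a mobile GPU can differ, since memory
# access patterns and kernel-launch overheads are not captured by FLOP counts.
print(flops_3x3 == flops_1x1)
```

This gap between the FLOP count and measured runtime is the motivation for a dedicated latency-estimation tool.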
Main Authors: Evgeny Ponomarev, Sergey Matveev, Ivan Oseledets, Valery Glukhov
Format: Article
Language: English
Published: MDPI AG, 2021-08-01
Series: Computers
Online Access: https://www.mdpi.com/2073-431X/10/8/104
Similar Items
- Latency-aware automatic CNN channel pruning with GPU runtime analysis
  by: Jiaqiang Liu, et al.
  Published: (2021-10-01)
- Inference Latency Prediction Approaches Using Statistical Information for Object Detection in Edge Computing
  by: Gyuyeol Kong, et al.
  Published: (2023-08-01)
- FLIA: Architecture of Collaborated Mobile GPU and FPGA Heterogeneous Computing
  by: Nan Hu, et al.
  Published: (2022-11-01)
- Cost Efficient GPU Cluster Management for Training and Inference of Deep Learning
  by: Dong-Ki Kang, et al.
  Published: (2022-01-01)
- Building Modern GPU Brute-Force Collision Resistible Hash Algorithm
  by: L. A. Nadeinsky
  Published: (2012-03-01)