Latency Estimation Tool and Investigation of Neural Networks Inference on Mobile GPU
Many deep learning applications need to run on mobile devices, and for many of them both accuracy and inference time matter. While the number of FLOPs is commonly used as a proxy for neural network latency, it may not be the best choice. In order to obtain a better approximation of...
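The abstract's point that FLOPs are only a rough latency proxy can be illustrated with a minimal sketch (a hypothetical example, not the article's tool): two convolution layers can have identical analytic FLOP counts yet very different on-device latency.

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out):
    """Analytic FLOP count for one Conv2d layer (multiply + add per MAC)."""
    k_h, k_w = kernel
    return 2 * c_in * k_h * k_w * c_out * h_out * w_out

# A 3x3 conv and a wider 1x1 conv with the same FLOP budget:
flops_3x3 = conv2d_flops(64, 64, (3, 3), 56, 56)
flops_1x1 = conv2d_flops(64, 576, (1, 1), 56, 56)

# Equal FLOPs, yet measured latency on a mobile GPU can differ, since memory
# access patterns and kernel-launch overheads are not captured by FLOP counts.
print(flops_3x3 == flops_1x1)
```

This gap between the FLOP count and measured runtime is the motivation for a dedicated latency-estimation tool.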
Main Authors: Evgeny Ponomarev, Sergey Matveev, Ivan Oseledets, Valery Glukhov
Format: Article
Language: English
Published: MDPI AG, 2021-08-01
Series: Computers
Online Access: https://www.mdpi.com/2073-431X/10/8/104
Similar Items
- Latency-aware automatic CNN channel pruning with GPU runtime analysis
  by: Jiaqiang Liu, et al.
  Published: (2021-10-01)
- Inference Latency Prediction Approaches Using Statistical Information for Object Detection in Edge Computing
  by: Gyuyeol Kong, et al.
  Published: (2023-08-01)
- FLIA: Architecture of Collaborated Mobile GPU and FPGA Heterogeneous Computing
  by: Nan Hu, et al.
  Published: (2022-11-01)
- Cost Efficient GPU Cluster Management for Training and Inference of Deep Learning
  by: Dong-Ki Kang, et al.
  Published: (2022-01-01)
- Building Modern GPU Brute-Force Collision Resistible Hash Algorithm
  by: L. A. Nadeinsky
  Published: (2012-03-01)