Benchmarking Inference of Transformer-Based Transcription Models With Clustering on Embedded GPUs
Early awareness of inference performance ensures the feasibility of machine learning for embedded deployment. Often, ML model selection focuses first on training performance and accuracy, with inference considered second. While prioritizing training is necessary, model inference performance is...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10595070/ |