MKD: Mixup-Based Knowledge Distillation for Mandarin End-to-End Speech Recognition

Large-scale automatic speech recognition models have achieved impressive performance. However, training an ASR model requires huge computational resources and massive amounts of data. Knowledge distillation is a prevalent model compression method that transfers the knowledge from a large model to a sm...
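The record itself gives no implementation details, so the following is only a minimal sketch of what mixup-based knowledge distillation for an end-to-end ASR student could look like, assuming a PyTorch-style setup. The function name, hyperparameters, and the frame-level cross-entropy stand-in (in place of the CTC/attention losses a real E2E Mandarin ASR system would use) are illustrative assumptions, not the authors' method.

```python
import torch
import torch.nn.functional as F


def mixup_distillation_step(student, teacher, x1, x2, y1, y2,
                            alpha=0.5, temperature=2.0, kd_weight=0.5):
    """Hypothetical training step: mix two input utterances, then distill
    the teacher's soft predictions on the mixed input into the student."""
    # Sample a mixing coefficient from a Beta distribution (standard mixup).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()

    # Interpolate the two acoustic feature tensors (e.g. log-mel spectrograms).
    x_mix = lam * x1 + (1.0 - lam) * x2

    # The teacher provides soft targets on the mixed input; no gradients needed.
    with torch.no_grad():
        teacher_logits = teacher(x_mix)

    student_logits = student(x_mix)

    # KL-divergence distillation loss between temperature-softened distributions.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Hard-label loss mixed in proportion to lam; plain cross-entropy is used
    # here purely as a simplified stand-in for an ASR objective.
    ce_loss = (lam * F.cross_entropy(student_logits, y1)
               + (1.0 - lam) * F.cross_entropy(student_logits, y2))

    return kd_weight * kd_loss + (1.0 - kd_weight) * ce_loss
```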


Bibliographic Details
Main Authors: Xing Wu, Yifan Jin, Jianjia Wang, Quan Qian, Yike Guo
Format: Article
Language: English
Published: MDPI AG 2022-05-01
Series: Algorithms
Online Access: https://www.mdpi.com/1999-4893/15/5/160