Attention-based latent features for jointly trained end-to-end automatic speech recognition with modified speech enhancement
In this paper, we propose a joint training framework that efficiently combines time-domain speech enhancement (SE) with an end-to-end (E2E) automatic speech recognition (ASR) system utilizing attention-based latent features. Using the latent feature to train E2E ASR implies that various time-domain...
Main Authors: | Da-Hee Yang, Joon-Hyuk Chang |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-03-01 |
Series: | Journal of King Saud University: Computer and Information Sciences |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1319157823000368 |
Similar Items
- KsponSpeech: Korean Spontaneous Speech Corpus for Automatic Speech Recognition
  by: Jeong-Uk Bang, et al.
  Published: (2020-10-01)
- Improving End-to-End Models for Children’s Speech Recognition
  by: Tanvina Patel, et al.
  Published: (2024-03-01)
- A Bidirectional Context Embedding Transformer for Automatic Speech Recognition
  by: Lyuchao Liao, et al.
  Published: (2022-01-01)
- A Dual-Channel End-to-End Speech Enhancement Method Using Complex Operations in the Time Domain
  by: Jian Pang, et al.
  Published: (2023-06-01)
- Fast offline transformer-based end-to-end automatic speech recognition for real-world applications
  by: Yoo Rhee Oh, et al.
  Published: (2022-06-01)