The XMUSPEECH System for Accented English Automatic Speech Recognition
In this paper, we present the XMUSPEECH systems for Track 2 of the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC2020). Track 2 is an Automatic Speech Recognition (ASR) task where the non-native English speakers have various accents, which reduces the accuracy of the ASR syste...
Main authors: | Fuchuan Tong, Tao Li, Dexin Liao, Shipeng Xia, Song Li, Qingyang Hong, Lin Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-01-01 |
Series: | Applied Sciences |
Subjects: | AESRC2020, i-vector, x-vector, multistream CNN |
Online access: | https://www.mdpi.com/2076-3417/12/3/1478 |
_version_ | 1827661570215247872 |
---|---|
author | Fuchuan Tong, Tao Li, Dexin Liao, Shipeng Xia, Song Li, Qingyang Hong, Lin Li |
author_facet | Fuchuan Tong, Tao Li, Dexin Liao, Shipeng Xia, Song Li, Qingyang Hong, Lin Li |
author_sort | Fuchuan Tong |
collection | DOAJ |
description | In this paper, we present the XMUSPEECH systems for Track 2 of the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC2020). Track 2 is an Automatic Speech Recognition (ASR) task where the non-native English speakers have various accents, which reduces the accuracy of the ASR system. To solve this problem, we experimented with acoustic models and input features. Furthermore, we trained a TDNN-LSTM language model for lattice rescoring to obtain better results. Compared with our baseline system, we achieved relative word error rate (WER) improvements of 40.7% and 35.7% on the development set and evaluation set, respectively. |
first_indexed | 2024-03-10T00:12:10Z |
format | Article |
id | doaj.art-82032864c6f743e5b811863a154f5d54 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T00:12:10Z |
publishDate | 2022-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-82032864c6f743e5b811863a154f5d54; 2023-11-23T15:58:16Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-01-01; vol. 12, no. 3, art. 1478; DOI 10.3390/app12031478; The XMUSPEECH System for Accented English Automatic Speech Recognition; Fuchuan Tong (School of Electronic Science and Engineering, Xiamen University, Xiamen 361005, China); Tao Li (School of Informatics, Xiamen University, Xiamen 361005, China); Dexin Liao (School of Informatics, Xiamen University, Xiamen 361005, China); Shipeng Xia (School of Informatics, Xiamen University, Xiamen 361005, China); Song Li (School of Electronic Science and Engineering, Xiamen University, Xiamen 361005, China); Qingyang Hong (School of Informatics, Xiamen University, Xiamen 361005, China); Lin Li (School of Electronic Science and Engineering, Xiamen University, Xiamen 361005, China); https://www.mdpi.com/2076-3417/12/3/1478; AESRC2020; i-vector; x-vector; multistream CNN |
title | The XMUSPEECH System for Accented English Automatic Speech Recognition |
title_full | The XMUSPEECH System for Accented English Automatic Speech Recognition |
title_fullStr | The XMUSPEECH System for Accented English Automatic Speech Recognition |
title_full_unstemmed | The XMUSPEECH System for Accented English Automatic Speech Recognition |
title_short | The XMUSPEECH System for Accented English Automatic Speech Recognition |
title_sort | xmuspeech system for accented english automatic speech recognition |
topic | AESRC2020, i-vector, x-vector, multistream CNN |
url | https://www.mdpi.com/2076-3417/12/3/1478 |