The XMUSPEECH System for Accented English Automatic Speech Recognition

In this paper, we present the XMUSPEECH systems for Track 2 of the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC2020). Track 2 is an Automatic Speech Recognition (ASR) task where the non-native English speakers have various accents, which reduces the accuracy of the ASR syste...


Bibliographic Details
Main Authors: Fuchuan Tong, Tao Li, Dexin Liao, Shipeng Xia, Song Li, Qingyang Hong, Lin Li
Format: Article
Language: English
Published: MDPI AG 2022-01-01
Series: Applied Sciences
Subjects: AESRC2020, i-vector, x-vector, multistream CNN
Online Access: https://www.mdpi.com/2076-3417/12/3/1478
_version_ 1827661570215247872
author Fuchuan Tong
Tao Li
Dexin Liao
Shipeng Xia
Song Li
Qingyang Hong
Lin Li
author_sort Fuchuan Tong
collection DOAJ
description In this paper, we present the XMUSPEECH systems for Track 2 of the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC2020). Track 2 is an Automatic Speech Recognition (ASR) task where the non-native English speakers have various accents, which reduces the accuracy of the ASR system. To solve this problem, we experimented with acoustic models and input features. Furthermore, we trained a TDNN-LSTM language model for lattice rescoring to obtain better results. Compared with our baseline system, we achieved relative word error rate (WER) improvements of 40.7% and 35.7% on the development set and evaluation set, respectively.
first_indexed 2024-03-10T00:12:10Z
format Article
id doaj.art-82032864c6f743e5b811863a154f5d54
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T00:12:10Z
publishDate 2022-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-82032864c6f743e5b811863a154f5d54 | 2023-11-23T15:58:16Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2022-01-01 | vol. 12, no. 3, art. 1478 | 10.3390/app12031478
The XMUSPEECH System for Accented English Automatic Speech Recognition
Fuchuan Tong (School of Electronic Science and Engineering, Xiamen University, Xiamen 361005, China)
Tao Li (School of Informatics, Xiamen University, Xiamen 361005, China)
Dexin Liao (School of Informatics, Xiamen University, Xiamen 361005, China)
Shipeng Xia (School of Informatics, Xiamen University, Xiamen 361005, China)
Song Li (School of Electronic Science and Engineering, Xiamen University, Xiamen 361005, China)
Qingyang Hong (School of Informatics, Xiamen University, Xiamen 361005, China)
Lin Li (School of Electronic Science and Engineering, Xiamen University, Xiamen 361005, China)
https://www.mdpi.com/2076-3417/12/3/1478
AESRC2020; i-vector; x-vector; multistream CNN
title The XMUSPEECH System for Accented English Automatic Speech Recognition
topic AESRC2020
i-vector
x-vector
multistream CNN
url https://www.mdpi.com/2076-3417/12/3/1478