Development of Speech Recognition Systems in Emergency Call Centers

In this paper, various acoustic and language modeling methodologies, as well as labeling methods, for automatic speech recognition of spoken dialogues in emergency call centers were investigated and comparatively analyzed. Because dialogue speech in call centers has specific contex...


Bibliographic Details
Main Authors:	Alakbar Valizada, Natavan Akhundova, Samir Rustamov
Format:	Article
Language:	English
Published:	MDPI AG 2021-04-01
Series:	Symmetry
Subjects:	speech recognition; GMM; HMM; DNN; Kaldi; call center
Online Access:	https://www.mdpi.com/2073-8994/13/4/634

collection	DOAJ
description	In this paper, various acoustic and language modeling methodologies, as well as labeling methods, for automatic speech recognition of spoken dialogues in emergency call centers were investigated and comparatively analyzed. Because dialogue speech in call centers has a specific context and occurs in noisy, emotional environments, available speech recognition systems perform poorly on it. Therefore, to recognize dialogue speech accurately, the main modules of speech recognition systems (language models and acoustic training methodologies) as well as symmetric data labeling approaches were investigated and analyzed. To find an effective acoustic model for dialogue data, different types of Gaussian Mixture Model/Hidden Markov Model (GMM/HMM) and Deep Neural Network/Hidden Markov Model (DNN/HMM) methodologies were trained and compared. Additionally, effective language models for dialogue systems were identified based on extrinsic and intrinsic evaluation. Lastly, our suggested data labeling approaches with spelling correction were compared with common labeling methods and outperformed them by a notable percentage. Based on the experimental results, we determined that DNN/HMM for the acoustic model, a trigram with Kneser–Ney discounting for the language model, and spelling correction applied to the data before training for the labeling method are effective configurations for dialogue speech recognition in emergency call centers. This research was conducted with two different types of datasets collected from emergency calls: the Dialogue dataset (27 h), which encapsulates call agents’ speech, and the Summary dataset (53 h), which contains voiced summaries of those dialogues describing emergency cases. Although the speech taken from the emergency call center is in the Azerbaijani language, which belongs to the Turkic group of languages, our approaches are not tightly connected to specific language features. Hence, it is anticipated that the suggested approaches can be applied to other languages of the same group.
first_indexed	2024-03-10T12:27:33Z
id	doaj.art-f15b60972a944b4f9ab6ab2aaa422369
institution	Directory Open Access Journal
issn	2073-8994
last_indexed	2024-03-10T12:27:33Z
spelling	doaj.art-f15b60972a944b4f9ab6ab2aaa422369; 2023-11-21T14:56:17Z; eng; MDPI AG; Symmetry; 2073-8994; 2021-04-01; 13; 4; 634; 10.3390/sym13040634; Development of Speech Recognition Systems in Emergency Call Centers; Alakbar Valizada (0), Natavan Akhundova (1), Samir Rustamov (2); (0) Artificial Intelligence Laboratory, ATL Tech, Jalil Mammadguluzadeh 102A, Baku 1022, Azerbaijan; (1) Artificial Intelligence Laboratory, ATL Tech, Jalil Mammadguluzadeh 102A, Baku 1022, Azerbaijan; (2) School of Information Technologies and Engineering, ADA University, Ahmadbey Aghaoglu Str. 11, Baku 1008, Azerbaijan; https://www.mdpi.com/2073-8994/13/4/634; speech recognition; GMM; HMM; DNN; Kaldi; call center
