How does language model size effects speech recognition accuracy for the Turkish language?

This paper investigates the effect of language model (LM) size on speech recognition (SR) accuracy and details our approach to building an LM for Turkish. Because an LM is obtained by statistical processing of raw text, we expect that by increasing the size of availabl...

Bibliographic Details
Main Authors:	Behnam Asefisaray, Erhan Mengüşoğlu, Hayri Sever, Murat Hacıömeroğlu
Format:	Article
Language:	English
Published:	Pamukkale University 2016-05-01
Series:	Pamukkale University Journal of Engineering Sciences
Subjects:	language model; speech recognition systems; language model weight; active token count
Online Access:	https://dergipark.org.tr/tr/pub/pajes/issue/20566/219179

author	Behnam Asefisaray Erhan Mengüşoğlu Hayri Sever Murat Hacıömeroğlu
collection	DOAJ
description	This paper investigates the effect of language model (LM) size on speech recognition (SR) accuracy and details our approach to building an LM for Turkish. Because an LM is obtained by statistical processing of raw text, we expect SR accuracy to improve as the amount of training data for the LM increases. Turkish is a highly agglutinative language, so it is important to determine an appropriate size for the training data; the minimum required data size is expected to be much larger than that needed for a language with a low level of agglutination, such as English. In the experiments we also adjusted the Language Model Weight (LMW) and Active Token Count (ATC) parameters, as their suitable values are expected to differ for a highly agglutinative language. We show that increasing the training data size to an appropriate level improved recognition accuracy, whereas changes to LMW and ATC did not have a positive effect on Turkish speech recognition accuracy.
first_indexed	2024-04-10T12:28:57Z
format	Article
id	doaj.art-79bb4dadafee4db291ed50c423977927
institution	Directory Open Access Journal
issn	1300-7009 2147-5881
language	English
last_indexed	2024-04-10T12:28:57Z
publishDate	2016-05-01
publisher	Pamukkale University
record_format	Article
series	Pamukkale University Journal of Engineering Sciences
title	How does language model size effects speech recognition accuracy for the Turkish language?
topic	language model; speech recognition systems; language model weight; active token count
url	https://dergipark.org.tr/tr/pub/pajes/issue/20566/219179

Similar Items