Standard Yorùbá context dependent tone identification using Multi-Class Support Vector Machine (MSVM)

Most state-of-the-art large-vocabulary continuous speech recognition systems employ context dependent (CD) phone units; however, CD phone units are not efficient at capturing the long-term spectral dependencies of tone in most tone languages. Standard Yorùbá (SY) is a language composed of sylla...

Bibliographic Details
Main Authors:	A.A. Sosimi, T. Adegbola, O.A. Fakinlede
Format:	Article
Language:	English
Published:	Joint Coordination Centre of the World Bank assisted National Agricultural Research Programme (NARP) 2019-06-01
Series:	Journal of Applied Sciences and Environmental Management
Subjects:	Syllabification; Standard Yorùbá; Context Dependent Tone; Tri-tone Recognition
Online Access:	https://www.ajol.info/index.php/jasem/article/view/187585
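
The abstract in this record names two signal-processing steps without giving details: an AMDF-based F0 contour and per-syllable slope and intercept features. The following is a minimal Python sketch of those two steps, assuming NumPy and the textbook average magnitude difference function. The function names, the F0 search range, and the 10 ms frame step are illustrative assumptions, not settings taken from the article.

```python
import numpy as np


def amdf_f0(frame, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate F0 (Hz) of one voiced frame from the deepest AMDF valley.

    Assumption: the search range [f_min, f_max] is illustrative only.
    The frame must be longer than sample_rate / f_min samples.
    """
    frame = np.asarray(frame, dtype=float)
    lag_min = int(sample_rate / f_max)  # shortest candidate period, in samples
    lag_max = int(sample_rate / f_min)  # longest candidate period, in samples
    lags = np.arange(lag_min, lag_max + 1)
    # AMDF(lag) = mean |x[n] - x[n + lag]|; it dips near the pitch period.
    amdf = np.array(
        [np.mean(np.abs(frame[:-lag] - frame[lag:])) for lag in lags]
    )
    return sample_rate / lags[np.argmin(amdf)]


def slope_intercept(f0_values, frame_step_s=0.01):
    """Fit a straight line to the F0 values of one segmented unit.

    Assumption: frames are evenly spaced, frame_step_s seconds apart.
    Returns (slope in Hz per second, intercept in Hz).
    """
    times = np.arange(len(f0_values)) * frame_step_s
    slope, intercept = np.polyfit(times, f0_values, deg=1)
    return slope, intercept
```

A plain minimum search like this can lock onto a multiple of the true period when the lag range spans more than one period, so a practical system would narrow the range per speaker or add a check against octave errors.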

author	A.A. Sosimi, T. Adegbola, O.A. Fakinlede
collection	DOAJ
description	Most state-of-the-art large-vocabulary continuous speech recognition systems employ context dependent (CD) phone units; however, CD phone units are not efficient at capturing the long-term spectral dependencies of tone in most tone languages. Standard Yorùbá (SY) is a language composed of syllables with tones and requires a different method for acoustic modeling. In this paper, a context dependent tone acoustic model was developed. The tone unit is assumed to be the syllable. The average magnitude difference function (AMDF) was used to derive the utterance-wide F0 contour, followed by automatic syllabification and tri-syllable forced alignment with the SPPAS (Speech Phonetization Alignment and Syllabification) tool. For classification of the CD tone, the slope and intercept of the F0 values were extracted from each segmented unit. A supervised clustering scheme was used to partition the CD tri-tones by category, and the features were normalized based on some statistics to derive the acoustic feature vectors. A multi-class support vector machine (MSVM) was used for tri-tone training. The experimental results showed that the word recognition accuracy of the MSVM tri-tone system, based on dynamic programming tone embedded features, was comparable with that obtained using phone features. The best parameter tuning was obtained with 10-fold cross-validation, and the overall accuracy was 97.5678%. In terms of word error rate (WER), the MSVM CD tri-tone system outperforms the hidden Markov model tri-phone system, with a WER of 44.47%. Keywords: Syllabification, Standard Yorùbá, Context Dependent Tone, Tri-tone Recognition
first_indexed	2024-04-24T14:49:50Z
format	Article
id	doaj.art-0087a4bda22d49149e5eb1adbe003818
institution	Directory Open Access Journal
issn	2659-1502 2659-1499
language	English
last_indexed	2024-04-24T14:49:50Z
publishDate	2019-06-01
publisher	Joint Coordination Centre of the World Bank assisted National Agricultural Research Programme (NARP)
record_format	Article
series	Journal of Applied Sciences and Environmental Management
spelling	doaj.art-0087a4bda22d49149e5eb1adbe003818; 2024-04-02T19:50:38Z; eng; ISSN 2659-1502, 2659-1499; 2019-06-01; vol. 23, no. 5; doi 10.4314/jasem.v23i5.20
title	Standard Yorùbá context dependent tone identification using Multi-Class Support Vector Machine (MSVM)
topic	Syllabification; Standard Yorùbá; Context Dependent Tone; Tri-tone Recognition
url	https://www.ajol.info/index.php/jasem/article/view/187585

Similar Items