Finite-state super transducers for compact language resource representation in edge voice-AI


Bibliographic Details
Main Authors:	Simon Dobrišek, Žiga Golob, Jerneja Žganec Gros
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2022-12-01
Series:	Systems Science & Control Engineering
Subjects:	Speech synthesis; pronunciation dictionary; finite-state super transducers; automatic grapheme-to-phoneme conversion
Online Access:	https://www.tandfonline.com/doi/10.1080/21642583.2022.2089930

collection	DOAJ
description	Finite-state transducers have been proven to yield compact representations of pronunciation dictionaries used for grapheme-to-phoneme conversion in speech engines running on low-resource embedded platforms. However, for highly inflected languages even more efficient language resource reduction methods are needed. In this paper, we demonstrate that the size of finite-state transducers tends to decrease when the number of word forms in the modelled pronunciation dictionary reaches a certain threshold. Motivated by this finding, we propose and evaluate a new type of finite-state transducer, called ‘finite-state super transducers’, which allow for the representation of pronunciation dictionaries by a smaller number of states and transitions, thereby significantly reducing the size of the language resource representation in comparison to minimal deterministic finite-state transducers by up to 25%. Further, we demonstrate that finite-state super transducers exhibit a generalization capability as they may accept and thereby phonetically transform even inflected word forms that had not been initially represented in the original pronunciation dictionary used for building the finite-state super transducer. This method is suitable for speech engines operating on platforms at the edge of an AI system with restricted memory capabilities and processing power, where efficient speech processing methods based on compact language resources must be implemented.
id	doaj.art-fe9c98f3172c45799e319405d114ae2d
institution	Directory of Open Access Journals
issn	2164-2583
affiliations	Simon Dobrišek: Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia; Žiga Golob and Jerneja Žganec Gros: Alpineon Research and Development Ltd., Ljubljana, Slovenia
citation	Systems Science & Control Engineering, vol. 10, no. 1, pp. 636-644, 2022-12-01, doi:10.1080/21642583.2022.2089930


Similar Items