FST-Based Pronunciation Lexicon Compression for Speech Engines
Finite-state transducers are frequently used for pronunciation lexicon representations in speech engines, in which memory and processing resources are scarce. This paper proposes two possibilities for further reducing the memory footprint of finite-state transducers representing pronunciation lexico...
Main Authors: | Žiga Golob, Jerneja Žganec Gros, Mario Žganec, Boštjan Vesnicer, Simon Dobrišek |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2012-11-01 |
Series: | International Journal of Advanced Robotic Systems |
Online Access: | https://doi.org/10.5772/52795 |
_version_ | 1818576678314573824 |
---|---|
author | Žiga Golob, Jerneja Žganec Gros, Mario Žganec, Boštjan Vesnicer, Simon Dobrišek |
author_sort | Žiga Golob |
collection | DOAJ |
description | Finite-state transducers are frequently used for pronunciation lexicon representations in speech engines, in which memory and processing resources are scarce. This paper proposes two possibilities for further reducing the memory footprint of finite-state transducers representing pronunciation lexicons. First, different alignments of grapheme and allophone transcriptions are studied and a reduction in the number of states of up to 30% is reported. Second, a combination of grapheme-to-allophone rules with a finite-state transducer is proposed, which yields a 65% smaller finite-state transducer than conventional approaches. |
first_indexed | 2024-12-16T06:17:50Z |
format | Article |
id | doaj.art-437c63975fe64da38f26bd43fe9c0e06 |
institution | Directory Open Access Journal |
issn | 1729-8814 |
language | English |
last_indexed | 2024-12-16T06:17:50Z |
publishDate | 2012-11-01 |
publisher | SAGE Publishing |
record_format | Article |
series | International Journal of Advanced Robotic Systems |
spelling | doaj.art-437c63975fe64da38f26bd43fe9c0e06; 2022-12-21T22:41:12Z; eng; SAGE Publishing; International Journal of Advanced Robotic Systems; 1729-8814; 2012-11-01; 9; 10.5772/52795; 10.5772_52795; FST-Based Pronunciation Lexicon Compression for Speech Engines; Žiga Golob (Alpineon Research and Development Ltd., Ljubljana, Slovenia), Jerneja Žganec Gros (Alpineon Research and Development Ltd., Ljubljana, Slovenia), Mario Žganec (Alpineon Research and Development Ltd., Ljubljana, Slovenia), Boštjan Vesnicer (Alpineon Research and Development Ltd., Ljubljana, Slovenia), Simon Dobrišek (Faculty of Electrical Engineering, University of Ljubljana, Slovenia); Finite-state transducers are frequently used for pronunciation lexicon representations in speech engines, in which memory and processing resources are scarce. This paper proposes two possibilities for further reducing the memory footprint of finite-state transducers representing pronunciation lexicons. First, different alignments of grapheme and allophone transcriptions are studied and a reduction in the number of states of up to 30% is reported. Second, a combination of grapheme-to-allophone rules with a finite-state transducer is proposed, which yields a 65% smaller finite-state transducer than conventional approaches.; https://doi.org/10.5772/52795 |
title | FST-Based Pronunciation Lexicon Compression for Speech Engines |
url | https://doi.org/10.5772/52795 |