A Flexible Architecture for Urdu Phonemes-Based Concatenative Speech Synthesis
TTS (Text-to-Speech) synthesis systems are widely used around the world to improve access to information and to let people with disabilities interact directly with computers and benefit from modern technology. Various TTS synthesis techniques hav...
Main Authors: | MUHAMMAD RIZWAN AHMAD, MUHAMMAD JUNAID ARSHAD |
---|---|
Format: | Article |
Language: | English |
Published: | Mehran University of Engineering and Technology, 2016-07-01 |
Series: | Mehran University Research Journal of Engineering and Technology |
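Subjects: | Articulatory; Text-to-Speech; Formant; Concatenative; Natural Language Processing; Waveforms; Speech Units; Phonemes; Speech Synthesis |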
Online Access: | http://publications.muet.edu.pk/research_papers/pdf/pdf1365.pdf |
_version_ | 1818233992470593536 |
---|---|
author | MUHAMMAD RIZWAN AHMAD; MUHAMMAD JUNAID ARSHAD |
author_sort | MUHAMMAD RIZWAN AHMAD |
collection | DOAJ |
description | TTS (Text-to-Speech) synthesis systems are widely used around the world to improve access to information and to let people with disabilities interact directly with computers and benefit from modern technology. Various TTS synthesis techniques have been used, each with its own advantages and limitations. No architecture based on a concatenative synthesis strategy exists for an Urdu TTS synthesis system that handles homographs and avoids the unnatural, robotic-sounding speech produced by the use of diphones. In this paper, we propose a flexible architecture for an Urdu TTS synthesis system that uses a concatenative synthesis strategy, because this approach can join units from a small speech corpus to generate natural and intelligible sound. The main aim of this research is to disambiguate homographs in the Urdu language and to avoid unnatural, robotic-sounding speech. Finally, the effectiveness of the system is tested in terms of intelligibility and acceptability at the word and sentence levels. The intelligibility rate is close to 80% and 65%, while the acceptability rate for naturalness is 95% (75% natural, 20% acceptable). |
first_indexed | 2024-12-12T11:30:59Z |
format | Article |
id | doaj.art-a16dbfb32c2c483891736af9ce5175b6 |
institution | Directory of Open Access Journals |
issn | 0254-7821 2413-7219 |
language | English |
last_indexed | 2024-12-12T11:30:59Z |
publishDate | 2016-07-01 |
publisher | Mehran University of Engineering and Technology |
record_format | Article |
series | Mehran University Research Journal of Engineering and Technology |
title | A Flexible Architecture for Urdu Phonemes-Based Concatenative Speech Synthesis |
topic | Articulatory; Text-to-Speech; Formant; Concatenative; Natural Language Processing; Waveforms; Speech Units; Phonemes; Speech Synthesis |
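
The abstract describes a concatenative strategy that joins stored speech units into an utterance. The record gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' architecture. Sine tones stand in for recorded phonemes, and the unit labels, the sample rate, the `tone` and `concatenate` helpers, and the linear crossfade at each joint are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 16000  # Hz; an assumed rate, not taken from the paper


def tone(freq_hz, duration_s):
    """Stand-in for a recorded phoneme: a plain sine tone (hypothetical data)."""
    n = round(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


# Hypothetical unit inventory: phoneme label -> waveform samples.
UNITS = {
    "k": tone(300, 0.05),
    "a": tone(440, 0.10),
    "m": tone(220, 0.08),
}


def concatenate(phonemes, units, overlap=80):
    """Join unit waveforms, linearly crossfading `overlap` samples at each joint."""
    out = []
    for label in phonemes:
        unit = units[label]
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight for the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


speech = concatenate(["k", "a", "m"], UNITS)
```

The crossfade is one simple way to soften the discontinuities at unit boundaries that contribute to robotic-sounding output; a real system would select units from recorded speech and would also need the homograph disambiguation step the abstract mentions, which is not sketched here.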