End to End Text to Speech Synthesis for Malay Language using Tacotron and Tacotron 2
Text-to-speech (TTS) technology is becoming increasingly popular in various fields such as education and business. However, TTS technology for the Malay language has advanced more slowly than for other languages, especially English. The rise of artificial intelligence (AI) technology has...
Main Authors: | Abdul Aziz, Azrul Fahmi; Sabrina Tiun, Sabrina Tiun; Ruslan, Noraini |
---|---|
Format: | Article |
Language: | English |
Published: | ijacsa 2023 |
Subjects: | T Technology (General) |
Online Access: | http://eprints.uthm.edu.my/10565/1/J16421_03fadd928a98d4594999185deb803d1a.pdf |
_version_ | 1825638380086493184 |
---|---|
author | Abdul Aziz, Azrul Fahmi; Sabrina Tiun, Sabrina Tiun; Ruslan, Noraini |
author_sort | Abdul Aziz, Azrul Fahmi |
collection | UTHM |
description | Text-to-speech (TTS) technology is becoming increasingly popular in various fields such as education and business. However, TTS technology for the Malay language has advanced more slowly than for other languages, especially English. The rise of artificial intelligence (AI) technology has pushed TTS technology into a new dimension. An end-to-end (E2E) TTS system, which generates speech directly from text input, is one of the latest AI technologies for TTS, and applying this E2E method to the Malay language will help to expand TTS technology for Malay. This study involves the development and comparison of two end-to-end TTS models for the Malay language, namely Tacotron and Tacotron 2. Both models were trained on a Malay corpus consisting of text and speech, and the synthesized speech was evaluated using Mean Opinion Scores (MOS) for naturalness and intelligibility. The results show that Tacotron outperformed Tacotron 2 in terms of naturalness and intelligibility, with both models falling short of human speech quality. Improving TTS technology for Malay can encourage its use in a wider range of contexts. |
first_indexed | 2024-03-05T22:05:54Z |
format | Article |
id | uthm.eprints-10565 |
institution | Universiti Tun Hussein Onn Malaysia |
language | English |
last_indexed | 2024-03-05T22:05:54Z |
publishDate | 2023 |
publisher | ijacsa |
record_format | dspace |
spelling | uthm.eprints-10565 2024-01-03T01:37:13Z http://eprints.uthm.edu.my/10565/ End to End Text to Speech Synthesis for Malay Language using Tacotron and Tacotron 2 Abdul Aziz, Azrul Fahmi; Sabrina Tiun, Sabrina Tiun; Ruslan, Noraini T Technology (General) ijacsa 2023 Article PeerReviewed text en http://eprints.uthm.edu.my/10565/1/J16421_03fadd928a98d4594999185deb803d1a.pdf Abdul Aziz, Azrul Fahmi and Sabrina Tiun, Sabrina Tiun and Ruslan, Noraini (2023) End to End Text to Speech Synthesis for Malay Language using Tacotron and Tacotron 2. International Journal of Advanced Computer Science and Applications, 14 (6). pp. 415-431. |
title | End to End Text to Speech Synthesis for Malay Language using Tacotron and Tacotron 2 |
topic | T Technology (General) |
url | http://eprints.uthm.edu.my/10565/1/J16421_03fadd928a98d4594999185deb803d1a.pdf |