End to End Text to Speech Synthesis for Malay Language using Tacotron and Tacotron 2

Text-to-speech (TTS) technology is becoming increasingly popular in fields such as education and business. However, TTS technology for the Malay language has advanced more slowly than for other languages, especially English. The rise of artificial intelligence (AI) technology has...

Bibliographic Details
Main Authors: Abdul Aziz, Azrul Fahmi; Sabrina Tiun; Ruslan, Noraini
Format: Article
Language: English
Published: International Journal of Advanced Computer Science and Applications (IJACSA), 2023
Subjects: T Technology (General)
Online Access: http://eprints.uthm.edu.my/10565/1/J16421_03fadd928a98d4594999185deb803d1a.pdf
collection UTHM
description Text-to-speech (TTS) technology is becoming increasingly popular in fields such as education and business. However, TTS technology for the Malay language has advanced more slowly than for other languages, especially English. The rise of artificial intelligence (AI) has pushed TTS technology into a new dimension. An end-to-end (E2E) TTS system, which generates speech directly from text input, is one of the latest AI technologies for TTS, and applying this E2E method to Malay will help expand TTS technology for the language. This study involves the development and comparison of two end-to-end TTS models for the Malay language, namely Tacotron and Tacotron 2. Both models were trained on a Malay corpus consisting of text and speech, and the synthesized speech was evaluated using Mean Opinion Scores (MOS) for naturalness and intelligibility. The results show that Tacotron outperformed Tacotron 2 in terms of naturalness and intelligibility, with both models falling short of human speech quality. Improving TTS technology for Malay can encourage its use in a wider range of contexts.
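As an illustration of the Mean Opinion Score evaluation mentioned in the description, the following is a minimal sketch of how a MOS and an approximate 95% confidence interval might be computed from listener ratings. It assumes a 1 to 5 rating scale and a normal-approximation interval (the 1.96 factor); neither assumption is confirmed by this record. The function name `mean_opinion_score`, the system labels, and all ratings are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings.

    The interval uses a normal approximation (1.96 * standard error),
    which is an assumption of this sketch, not the study's stated method.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings, invented for illustration only (not from the study).
example_ratings = {
    "system_a": [4, 5, 3, 4, 4],
    "system_b": [3, 4, 3, 2, 3],
}

for name, ratings in example_ratings.items():
    mos, ci = mean_opinion_score(ratings)
    print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

In practice, naturalness and intelligibility would each get their own set of ratings, and a small listener panel would usually call for a t-distribution interval instead of the normal approximation used here.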
id uthm.eprints-10565
institution Universiti Tun Hussein Onn Malaysia
citation Abdul Aziz, Azrul Fahmi and Sabrina Tiun and Ruslan, Noraini (2023) End to End Text to Speech Synthesis for Malay Language using Tacotron and Tacotron 2. International Journal of Advanced Computer Science and Applications, 14 (6). pp. 415-431. Peer reviewed.
topic T Technology (General)