ARCHITECTURE OF THE MULTIVOICE TEXT-TO-SPEECH SYSTEM

Bibliographic Details
Main Authors: V. A. Zakharyeu, A. A. Petrovsky
Format: Article
Language: Russian
Published: Educational institution «Belarusian State University of Informatics and Radioelectronics» 2019-06-01
Series: Doklady Belorusskogo gosudarstvennogo universiteta informatiki i radioèlektroniki
Online Access: https://doklady.bsuir.by/jour/article/view/239
Description
Summary: An architecture for a multivoice text-to-speech (TTS) synthesis system based on a voice conversion framework is proposed. Such a system can be tuned to a specific speaker without additional costs at the training phase, using the single-speaker database already present in the TTS system. A structural scheme of this type of speech synthesizer is presented, together with a description of the functionality of its main blocks. Its distinguishing characteristics are a synergistic approach to the architecture and a text-independent mode in the training phase.
ISSN: 1729-7648