HierTTS: Expressive End-to-End Text-to-Waveform Using a Multi-Scale Hierarchical Variational Auto-Encoder

End-to-end text-to-speech (TTS) models that directly generate waveforms from text are gaining popularity. However, existing end-to-end models are still not natural enough in their prosodic expressiveness. Additionally, previous studies on improving the expressiveness of TTS have mainly focused on ac...

Full description

Bibliographic Details
Main Authors: Zengqiang Shang, Peiyang Shi, Pengyuan Zhang, Li Wang, Guangying Zhao
Format: Article
Language:English
Published: MDPI AG 2023-01-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/13/2/868