Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

Abstract: Speech synthesis has made significant strides thanks to the transition from machine learning to deep learning models. Contemporary text-to-speech (TTS) models possess the capability to generate speech of exceptionally high quality, closely mimicking human speech. Nevertheless, given the wid...

Bibliographic Details
Main Authors: Huda Barakat, Oytun Turk, Cenk Demiroglu
Format: Article
Language: English
Published: SpringerOpen 2024-02-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
Online Access: https://doi.org/10.1186/s13636-024-00329-7