Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources
Abstract Speech synthesis has made significant strides thanks to the transition from machine learning to deep learning models. Contemporary text-to-speech (TTS) models possess the capability to generate speech of exceptionally high quality, closely mimicking human speech. Nevertheless, given the wid...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
SpringerOpen
2024-02-01
|
Series: | EURASIP Journal on Audio, Speech, and Music Processing |
Subjects: | |
Online Access: | https://doi.org/10.1186/s13636-024-00329-7 |