Sequence-to-Sequence Acoustic Modeling with Semi-Stepwise Monotonic Attention for Speech Synthesis

An encoder–decoder with attention has become a popular method to achieve sequence-to-sequence (Seq2Seq) acoustic modeling for speech synthesis. To improve the robustness of the attention mechanism, methods utilizing the monotonic alignment between phone sequences and acoustic feature sequences have...

Full description

Bibliographic Details
Main Authors: Xiao Zhou, Zhenhua Ling, Yajun Hu, Lirong Dai
Format: Article
Language:English
Published: MDPI AG 2021-11-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/11/21/10475