W2VC: WavLM representation based one-shot voice conversion with gradient reversal distillation and CTC supervision

Abstract: Non-parallel data voice conversion (VC) has achieved considerable breakthroughs in recent years through the use of self-supervised pre-trained representations (SSPR). Features extracted by the pre-trained model are expected to contain more content information. However, in common VC with SSPR,...

Bibliographic Details
Main Authors:	Hao Huang, Lin Wang, Jichen Yang, Ying Hu, Liang He
Format:	Article
Language:	English
Published:	SpringerOpen 2023-10-01
Series:	EURASIP Journal on Audio, Speech, and Music Processing
Subjects:	Voice conversion; Self-supervised pre-trained representation; Gradient reversal layer (GRL); CTC
Online Access:	https://doi.org/10.1186/s13636-023-00312-8

Similar Items