Mixture of factor analyzers using priors from non-parallel speech for voice conversion

A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from n...

Bibliographic Details
Main Authors:	Wu, Zhizheng, Kinnunen, Tomi, Chng, Eng Siong, Li, Haizhou
Other Authors:	School of Computer Engineering
Format:	Journal Article
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/102726 http://hdl.handle.net/10220/16436

_version_	1826117388770213888
author	Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou
author2	School of Computer Engineering
author_facet	School of Computer Engineering Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou
author_sort	Wu, Zhizheng
collection	NTU
description	A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method that integrates prior knowledge from non-parallel speech into the training of the conversion function. Experiments on the CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. In both objective and subjective evaluations, the proposed method outperforms the baseline GMM method.
first_indexed	2024-10-01T04:26:48Z
format	Journal Article
id	ntu-10356/102726
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T04:26:48Z
publishDate	2013
record_format	dspace
spelling	ntu-10356/1027262020-05-28T07:18:12Z Mixture of factor analyzers using priors from non-parallel speech for voice conversion Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou School of Computer Engineering Temasek Laboratories DRNTU::Engineering::Computer science and engineering A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from non-parallel speech into the training of conversion function. The experiments on CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show the proposed method outperforms the baseline GMM method. 2013-10-10T08:29:45Z 2019-12-06T20:59:37Z 2013-10-10T08:29:45Z 2019-12-06T20:59:37Z 2012 2012 Journal Article Wu, Z., Kinnunen, T., Chng, E. S., & Li, H. (2012). Mixture of factor analyzers using priors from non-parallel speech for voice conversion. IEEE signal processing letters, 19(12), 914-917. 1070-9908 https://hdl.handle.net/10356/102726 http://hdl.handle.net/10220/16436 10.1109/LSP.2012.2225615 en IEEE signal processing letters © 2012 IEEE
spellingShingle	DRNTU::Engineering::Computer science and engineering Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title	Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_full	Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_fullStr	Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_full_unstemmed	Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_short	Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_sort	mixture of factor analyzers using priors from non parallel speech for voice conversion
topic	DRNTU::Engineering::Computer science and engineering
url	https://hdl.handle.net/10356/102726 http://hdl.handle.net/10220/16436

Similar Items