Domain adversarial training for speech enhancement

The performance of deep learning approaches to speech enhancement degrades significantly in the face of a mismatch between training and testing conditions. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer that 1) overcomes domain mismatch, and 2) provides a solution...

Bibliographic Details
Main Authors:	Hou, Nana; Xu, Chenglin; Chng, Eng Siong; Li, Haizhou
Other Authors:	School of Computer Science and Engineering
Format:	Conference Paper
Language:	English
Published:	2020
Subjects:	Engineering::Computer science and engineering; Domain Adversarial Training; Speech Enhancement
Online Access:	https://hdl.handle.net/10356/144786

_version_	1826129058217328640
author	Hou, Nana; Xu, Chenglin; Chng, Eng Siong; Li, Haizhou
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering; Hou, Nana; Xu, Chenglin; Chng, Eng Siong; Li, Haizhou
author_sort	Hou, Nana
collection	NTU
description	The performance of deep learning approaches to speech enhancement degrades significantly in the face of a mismatch between training and testing conditions. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer that 1) overcomes domain mismatch, and 2) provides a solution for the scenario where only noisy speech data, and no clean-noisy parallel data, is available in the new domain. Specifically, our method includes two jointly trained parts: 1) an enhancement net that maps noisy speech to clean speech by indirectly estimating a mask with a spectrum approximation loss, and 2) a domain predictor that distinguishes between domains. Because the proposed approach can adapt to a new domain using only noisy speech data from the target domain, we call it an unsupervised learning technique. Experiments suggest that our approach delivers voice quality comparable to that of supervised learning techniques that require clean-noisy parallel data.
first_indexed	2024-10-01T07:34:32Z
format	Conference Paper
id	ntu-10356/144786
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T07:34:32Z
publishDate	2020
record_format	dspace
spelling	ntu-10356/1447862020-11-28T20:10:37Z Domain adversarial training for speech enhancement Hou, Nana Xu, Chenglin Chng, Eng Siong Li, Haizhou School of Computer Science and Engineering 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Air Traffic Management Research Institute Temasek Laboratories @ NTU Engineering::Computer science and engineering Domain Adversarial Training Speech Enhancement The performance of deep learning approaches to speech enhancement degrades significantly in face of mismatch between training and testing. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer, that 1) overcomes domain mismatch, and 2) provides a solution to the scenario where we only have noisy speech data, and we don’t have clean-noisy parallel data in the new domain. Specifically, our method includes two parts that are jointly trained, 1) an enhancement net to map noisy speech to clean speech by indirectly estimating a mask with a spectrum approximation loss, and 2) a domain predictor to distinguish between domains. As the proposed approach is able to adapt to a new domain only with noisy speech data in target domain, we call it an unsupervised learning technique. Experiments suggest that our approach delivers voice quality comparable with other supervised learning techniques that require clean-noisy parallel data. Accepted version This research is supported by Temasek Laboratories@NTU, Nanyang Technological University, Singapore. 2020-11-24T06:30:28Z 2020-11-24T06:30:28Z 2019 Conference Paper Hou, N., Xu, C., Chng, E. S., & Li, H. (2019). Domain adversarial training for speech enhancement. Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 667-672. doi:10.1109/APSIPAASC47483.2019.9023218 https://hdl.handle.net/10356/144786 10.1109/APSIPAASC47483.2019.9023218 667 672 en © 2019 IEEE. 
Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/APSIPAASC47483.2019.9023218 application/pdf
spellingShingle	Engineering::Computer science and engineering Domain Adversarial Training Speech Enhancement Hou, Nana Xu, Chenglin Chng, Eng Siong Li, Haizhou Domain adversarial training for speech enhancement
title	Domain adversarial training for speech enhancement
topic	Engineering::Computer science and engineering; Domain Adversarial Training; Speech Enhancement
url	https://hdl.handle.net/10356/144786
