Domain adversarial training for speech enhancement

The performance of deep learning approaches to speech enhancement degrades significantly in the face of a mismatch between training and testing conditions. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer that 1) overcomes domain mismatch, and 2) provides a solution...


Bibliographic Details
Main Authors: Hou, Nana; Xu, Chenglin; Chng, Eng Siong; Li, Haizhou
Other Authors: School of Computer Science and Engineering
Format: Conference Paper
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Domain Adversarial Training; Speech Enhancement
Online Access: https://hdl.handle.net/10356/144786
_version_ 1826129058217328640
author Hou, Nana
Xu, Chenglin
Chng, Eng Siong
Li, Haizhou
author2 School of Computer Science and Engineering
collection NTU
description The performance of deep learning approaches to speech enhancement degrades significantly when training and test conditions are mismatched. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer that 1) overcomes domain mismatch and 2) addresses the scenario where only noisy speech data, and no clean-noisy parallel data, are available in the new domain. Specifically, our method comprises two jointly trained parts: 1) an enhancement net that maps noisy speech to clean speech by indirectly estimating a mask with a spectrum approximation loss, and 2) a domain predictor that distinguishes between domains. Because the proposed approach adapts to a new domain using only noisy speech data from the target domain, we call it an unsupervised learning technique. Experiments suggest that our approach delivers voice quality comparable to that of supervised learning techniques that require clean-noisy parallel data.
first_indexed 2024-10-01T07:34:32Z
format Conference Paper
id ntu-10356/144786
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:34:32Z
publishDate 2020
record_format dspace
spelling ntu-10356/144786
2020-11-28T20:10:37Z
Domain adversarial training for speech enhancement
Hou, Nana
Xu, Chenglin
Chng, Eng Siong
Li, Haizhou
School of Computer Science and Engineering
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Air Traffic Management Research Institute
Temasek Laboratories @ NTU
Engineering::Computer science and engineering
Domain Adversarial Training
Speech Enhancement
Accepted version
This research is supported by Temasek Laboratories@NTU, Nanyang Technological University, Singapore.
2020-11-24T06:30:28Z
2020-11-24T06:30:28Z
2019
Conference Paper
Hou, N., Xu, C., Chng, E. S., & Li, H. (2019). Domain adversarial training for speech enhancement. Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 667-672. doi:10.1109/APSIPAASC47483.2019.9023218
https://hdl.handle.net/10356/144786
10.1109/APSIPAASC47483.2019.9023218
667-672
en
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/APSIPAASC47483.2019.9023218
application/pdf
topic Engineering::Computer science and engineering
Domain Adversarial Training
Speech Enhancement
url https://hdl.handle.net/10356/144786
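
Illustrative sketch (not part of the catalogued record): the description above names two jointly trained parts, an enhancement net trained with a spectrum approximation loss on an estimated mask, and a domain predictor. The Python below is a minimal sketch of those two loss terms, assuming a mean-squared-error form for the spectrum approximation loss and a binary cross-entropy for the domain predictor. It also assumes the adversarial coupling is done with a gradient reversal step, which is common in domain adversarial training but is not stated in the abstract. All function names and the weight `lam` are hypothetical.

```python
import math


def spectrum_approximation_loss(mask, noisy_mag, clean_mag):
    """Mean squared error between the masked noisy magnitudes and the clean ones.

    Assumed form: the mask is never compared with a reference mask; only the
    masked spectrum is compared with the clean spectrum.
    """
    errors = [(m * y - x) ** 2 for m, y, x in zip(mask, noisy_mag, clean_mag)]
    return sum(errors) / len(errors)


def domain_loss(prob_target, is_target):
    """Binary cross-entropy of a domain predictor (assumed: 1 = target, 0 = source)."""
    eps = 1e-12
    p = min(max(prob_target, eps), 1.0 - eps)  # clamp to keep log() finite
    return -math.log(p) if is_target else -math.log(1.0 - p)


def reversed_gradient(grad_from_domain_predictor, lam):
    """Gradient reversal (assumption): scale the domain gradient by -lam
    before it reaches the shared layers, so they learn to confuse the predictor."""
    return [-lam * g for g in grad_from_domain_predictor]


# Source-domain frame with a clean reference: a perfect mask gives zero loss.
perfect = spectrum_approximation_loss([0.5, 1.0], [2.0, 1.0], [1.0, 1.0])
# An all-pass mask leaves the noise in the first bin.
all_pass = spectrum_approximation_loss([1.0, 1.0], [2.0, 1.0], [1.0, 1.0])
# Target-domain frame: only the domain label is available, no clean reference.
undecided = domain_loss(0.5, is_target=True)
flipped = reversed_gradient([0.2, -0.4], lam=0.5)
print(perfect, all_pass, undecided, flipped)
```

Under these assumptions, only source-domain frames contribute to the spectrum approximation loss, while frames from both domains contribute to the domain loss. This is one way a model could adapt to a target domain that has noisy speech but no clean-noisy pairs.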