Double-Talk Detection-Aided Residual Echo Suppression via Spectrogram Masking and Refinement

Bibliographic Details
Main Authors:	Eran Shachar, Israel Cohen, Baruch Berdugo
Format:	Article
Language:	English
Published:	MDPI AG 2022-08-01
Series:	Acoustics
Subjects:	residual echo suppression; acoustic echo cancellation; double-talk detection; deep-learning
Online Access:	https://www.mdpi.com/2624-599X/4/3/39

Description:	Acoustic echo in full-duplex telecommunication systems is a common problem that may cause desired-speech quality degradation during double-talk periods. This problem is especially challenging in low signal-to-echo ratio (SER) scenarios, such as hands-free conversations over mobile phones when the loudspeaker volume is high. This paper proposes a two-stage deep-learning approach to residual echo suppression focused on the low SER scenario. The first stage consists of a speech spectrogram masking model integrated with a double-talk detector (DTD). The second stage consists of a spectrogram refinement model optimized for speech quality by minimizing a perceptual evaluation of speech quality (PESQ) related loss function. The proposed integration of DTD with the masking model outperforms several other configurations based on previous studies. We conduct an ablation study that shows the contribution of each part of the proposed system. We evaluate the proposed system in several SERs and demonstrate its efficiency in the challenging setting of a very low SER. Finally, the proposed approach outperforms competing methods in several residual echo suppression metrics. We conclude that the proposed system is well-suited for the task of low SER residual echo suppression.
ISSN:	2624-599X
Source:	Directory of Open Access Journals (DOAJ), record doaj.art-aa643fac09e946509002263c94730d69
Affiliation:	Andrew and Erna Viterbi Faculty of Electrical & Computer Engineering, Technion–Israel Institute of Technology, Technion City, Haifa 3200003, Israel (all three authors)
Citation:	Acoustics, vol. 4, no. 3, pp. 637-655
DOI:	10.3390/acoustics4030039


Similar Items