A Multi-Stage Acoustic Echo Cancellation Model Based on Adaptive Filters and Deep Neural Networks

A large amount of echo significantly impairs the quality and intelligibility of speech during communication. To address this issue, numerous studies have been conducted and models proposed to cancel echo. In this study, we propose a multi-stage acoustic echo cancellation model that utilizes...


Bibliographic Details
Main Authors: Shiyun Xu, Changjun He, Bosong Yan, Mingjiang Wang
Format: Article
Language: English
Published: MDPI AG 2023-07-01
Series: Electronics
Subjects: acoustic echo cancellation; multi-stage model; adaptive filter; deep neural network
Online Access: https://www.mdpi.com/2079-9292/12/15/3258
author Shiyun Xu
Changjun He
Bosong Yan
Mingjiang Wang
collection DOAJ
description A large amount of echo significantly impairs the quality and intelligibility of speech during communication. To address this issue, numerous studies have been conducted and models proposed to cancel echo. In this study, we propose a multi-stage acoustic echo cancellation model that combines an adaptive filter with a deep neural network. Our model consists of two parts: the Speex algorithm for canceling linear echo, and the multi-scale time-frequency UNet (MSTFUNet) for further echo cancellation. The Speex algorithm takes the far-end reference speech and the near-end microphone signal as inputs, and outputs the signal after linear echo cancellation. MSTFUNet takes the spectra of the far-end reference speech, the near-end microphone signal, and the output of Speex as inputs, and generates the estimated near-end speech spectrum as output. To enhance the performance of the Speex algorithm, we apply delay estimation and compensation to the far-end reference speech. For MSTFUNet, we employ multi-scale time-frequency processing to extract information from the input spectrum. Additionally, we incorporate an improved time-frequency self-attention to capture time-frequency information. Furthermore, we introduce channel time-frequency attention to alleviate information loss during downsampling and upsampling. In our experiments, we evaluate the performance of our proposed model on both our test set and the blind test set of the Acoustic Echo Cancellation challenge. Our proposed model exhibits superior performance in terms of acoustic echo cancellation and noise and reverberation suppression compared to other models.
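The first stage described above (delay estimation, delay compensation of the far-end reference, then linear echo cancellation) can be illustrated with a minimal sketch. This is not the authors' implementation: a brute-force cross-correlation search stands in for their delay estimator, and a plain time-domain NLMS filter stands in for the Speex canceller. The function names, the synthetic echo path, and all parameter values are assumptions chosen for illustration only.

```python
import random

def estimate_delay(far, mic, max_delay):
    """Return the lag (in samples) that maximizes the cross-correlation
    between the far-end reference and the microphone signal."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        score = sum(far[n - lag] * mic[n] for n in range(lag, len(mic)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def compensate_delay(far, lag):
    """Delay the far-end reference by `lag` samples, zero-padding the start."""
    return [0.0] * lag + far[:len(far) - lag]

def nlms_echo_cancel(far, mic, taps=8, mu=0.5, eps=1e-8):
    """Time-domain NLMS filter (a stand-in for Speex, assumed for this sketch).
    Returns the error signal, i.e. the microphone signal minus the echo estimate."""
    w = [0.0] * taps      # adaptive filter weights
    buf = [0.0] * taps    # most recent far-end samples, newest first
    out = []
    for x, d in zip(far, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = d - y                                    # residual after cancellation
        norm = sum(b * b for b in buf) + eps         # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out

# Synthetic test signal: white-noise far end, echo = delayed and filtered far end.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_delay = 25                 # hypothetical bulk delay in samples
echo_path = [0.6, 0.3, -0.1]    # hypothetical short echo path
mic = [
    sum(h * far[n - true_delay - k]
        for k, h in enumerate(echo_path) if n - true_delay - k >= 0)
    for n in range(len(far))
]

lag = estimate_delay(far, mic, max_delay=64)
aligned = compensate_delay(far, lag)
residual = nlms_echo_cancel(aligned, mic)
```

Compensating the bulk delay first lets a short filter (8 taps here) cover the whole echo path; without alignment the filter would need at least as many taps as the delay itself. In the model described above, the output of this stage would be passed, together with the far-end and microphone spectra, to the neural network stage, which this sketch does not cover.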
first_indexed 2024-03-11T00:29:15Z
format Article
id doaj.art-3d2b8e6cd9c34a35a5a59a7574ab8bc3
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-11T00:29:15Z
publishDate 2023-07-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-3d2b8e6cd9c34a35a5a59a7574ab8bc3 | 2023-11-18T22:48:36Z | eng | MDPI AG | Electronics | 2079-9292 | 2023-07-01 | vol. 12, no. 15, art. 3258 | doi:10.3390/electronics12153258 | A Multi-Stage Acoustic Echo Cancellation Model Based on Adaptive Filters and Deep Neural Networks | Shiyun Xu, Changjun He, Bosong Yan, Mingjiang Wang (all: Key Laboratory for Key Technologies of IoT Terminals, Harbin Institute of Technology, Shenzhen 518055, China) | https://www.mdpi.com/2079-9292/12/15/3258 | acoustic echo cancellation; multi-stage model; adaptive filter; deep neural network
title A Multi-Stage Acoustic Echo Cancellation Model Based on Adaptive Filters and Deep Neural Networks
topic acoustic echo cancellation
multi-stage model
adaptive filter
deep neural network
url https://www.mdpi.com/2079-9292/12/15/3258