Multichannel equalization applied to speech dereverberation

Speech signals acquired by a distant microphone inside an enclosed space are often degraded by reverberation. Reverberation results from the multipath propagation of a sound wave from its source to receivers. Reverberation can cause a detrimental effect on the perceived quality as well as the intelli...

Bibliographic Details
Main Author: Rajan Sobhana Rashobh
Other Authors: Andy Khong Wai Hoong
Format: Thesis
Language: English
Published: 2015
Subjects:
Online Access: https://hdl.handle.net/10356/62174
_version_ 1811689483220811776
author Rajan Sobhana Rashobh
author2 Andy Khong Wai Hoong
author_facet Andy Khong Wai Hoong
Rajan Sobhana Rashobh
author_sort Rajan Sobhana Rashobh
collection NTU
description Speech signals acquired by a distant microphone inside an enclosed space are often degraded by reverberation. Reverberation results from the multipath propagation of a sound wave from its source to receivers. Reverberation can have a detrimental effect on the perceived quality as well as the intelligibility of the speech signals. This results in performance degradation of systems such as hands-free telephony, hearing aids, and automatic speech/speaker recognition systems. One popular approach to mitigating the effects of reverberation is to achieve channel equalization via a two-stage process in which acoustic impulse responses (AIRs) are first estimated using blind channel identification (BCI) techniques, after which the received signals are filtered using inverse filters computed from the estimated AIRs. This thesis focuses on speech dereverberation employing BCI and inverse filtering. A typical AIR is often non-minimum phase, and its direct inversion will result in an unstable inverse filter. Multichannel equalization (MCEQ) algorithms developed for use with a microphone array are employed for the equalization of such non-minimum phase AIRs. Existing MCEQ algorithms achieve equalization in the time domain, and in this thesis a generalized framework that allows one to achieve equalization in different transform domains is proposed first. This is motivated by the fact that when equalization is performed in different domains, the inherent properties of the transforms can be exploited to achieve better equalization performance. Noting that the computational complexity of the non-adaptive MCEQ algorithm is proportional to the AIR order, a set of adaptive time-domain MCEQ algorithms is proposed to achieve equalization of high-order AIRs with reduced complexity. These algorithms iteratively estimate the inverse filters by minimizing a cost function. To improve the convergence as well as the equalization performance, the sparsity of the desired equalized response is taken into account in the cost function and update equation. Although the time-domain adaptive algorithms reduce complexity, they suffer from slow convergence. To overcome this limitation, complexity reduction in the frequency domain is exploited. The proposed algorithm, which achieves equalization in each frequency bin, is derived from the proposed generalized framework for MCEQ. It is shown that the proposed algorithm significantly reduces the complexity involved in MCEQ and exhibits higher robustness to channel estimation errors. To further reduce the processing time of the proposed frequency-domain MCEQ algorithm, adaptive filtering techniques are introduced. To achieve convergence in a single step, an optimal step size is derived for the proposed adaptive algorithm. Finally, a frequency-domain adaptive BCI algorithm is proposed for the estimation of unknown channels. The proposed algorithm exploits the spatial diversity of a multichannel system and estimates the AIRs based on the cross-relation among the channels. To gain more insight into its performance, the misconvergence problem is analyzed and, based on this analysis, a penalty term derived from a sparseness constraint is introduced to the cost function for noise robustness.
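The multichannel inverse-filtering idea mentioned in the description can be illustrated with a minimal sketch. It uses the classical time-domain least-squares (MINT-style) formulation, not the transform-domain or adaptive algorithms proposed in the thesis. The function names and the two toy three-tap channels are hypothetical and chosen only for illustration; real AIRs are thousands of taps long and would be estimated, not known.

```python
import numpy as np

def convolution_matrix(h, n_cols):
    """Return the (len(h) + n_cols - 1) x n_cols matrix H with H @ g == np.convolve(h, g)."""
    L = len(h)
    H = np.zeros((L + n_cols - 1, n_cols))
    for j in range(n_cols):
        H[j:j + L, j] = h
    return H

def mint_inverse_filters(channels, inv_len, delay=0):
    """Least-squares inverse filters g_m so that sum_m conv(h_m, g_m) approximates a delayed impulse."""
    # Stack one convolution matrix per channel side by side.
    H = np.hstack([convolution_matrix(h, inv_len) for h in channels])
    d = np.zeros(H.shape[0])
    d[delay] = 1.0  # desired equalized response: unit impulse at the chosen delay
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g.reshape(len(channels), inv_len)

def equalized_response(channels, filters):
    """Sum of each channel convolved with its own inverse filter."""
    return sum(np.convolve(h, g) for h, g in zip(channels, filters))

# Hypothetical toy channels (assumed for illustration only).
h1 = np.array([1.0, -2.5, 1.0])    # zeros at 0.5 and 2.0, so non-minimum phase
h2 = np.array([1.0, 0.5, -0.24])   # zeros at 0.3 and -0.8, none shared with h1

filters = mint_inverse_filters([h1, h2], inv_len=2)
eq = equalized_response([h1, h2], filters)
print(np.round(eq, 6))
```

In this sketch the single channel h1 has a zero outside the unit circle, so it has no stable causal inverse on its own, yet two short FIR filters applied to both channels recover an impulse. An exact solution of this kind requires that the channels share no common zeros and that the inverse-filter length is at least (L - 1) / (M - 1) for M channels of length L.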
first_indexed 2024-10-01T05:48:49Z
format Thesis
id ntu-10356/62174
institution Nanyang Technological University
language English
last_indexed 2024-10-01T05:48:49Z
publishDate 2015
record_format dspace
spelling ntu-10356/62174 2023-07-04T17:15:36Z Multichannel equalization applied to speech dereverberation Rajan Sobhana Rashobh Andy Khong Wai Hoong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering DOCTOR OF PHILOSOPHY (EEE) 2015-02-25T01:35:50Z 2015-02-25T01:35:50Z 2015 2015 Thesis Rajan Sobhana Rashobh. (2015). Multichannel equalization applied to speech dereverberation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/62174 10.32657/10356/62174 en 211 p.
application/pdf
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Rajan Sobhana Rashobh
Multichannel equalization applied to speech dereverberation
title Multichannel equalization applied to speech dereverberation
topic DRNTU::Engineering::Electrical and electronic engineering
url https://hdl.handle.net/10356/62174
work_keys_str_mv AT rajansobhanarashobh multichannelequalizationappliedtospeechdereverberation