A Novel Approach to Speech Enhancement Based on Deep Neural Networks

Bibliographic Details
Main Authors: SALEHI, M., MIRZAKUCHAKI, S.
Format: Article
Language: English
Published: Stefan cel Mare University of Suceava 2022-05-01
Series: Advances in Electrical and Computer Engineering
Online Access: http://dx.doi.org/10.4316/AECE.2022.02009
Description
Summary: Minimum mean-square error (MMSE) approaches have been shown to achieve state-of-the-art performance in speech enhancement. However, they cannot accurately estimate non-stationary noise sources. In this paper, a long short-term memory fully convolutional network (LSTM-FCN) is used to estimate the a priori signal-to-noise ratio (SNR), since the speech enhancement performance of an MMSE approach improves with the accuracy of its a priori SNR estimator. The proposed MMSE approach makes no assumptions about the characteristics of the noise or the speech. MMSE approaches that use the LSTM-FCN estimator are evaluated using the mean opinion score of the perceptual evaluation of speech quality (PESQ) and the short-time objective intelligibility (STOI) measure. The experimental investigation shows that using the LSTM-FCN estimator significantly improves the speech enhancement performance of an MMSE approach.
ISSN: 1582-7445, 1844-7600
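
Illustration (not from the article): the summary states that an MMSE approach improves with the accuracy of its a priori SNR estimate. A minimal sketch of why is given below, assuming the Wiener gain G = xi / (1 + xi) as a stand-in for an MMSE-type gain function. The record does not say which gain function the authors use, and the names `wiener_gain` and `enhance_magnitude` are hypothetical. The a priori SNR values would come from an estimator such as the LSTM-FCN; here they are simply passed in.

```python
import numpy as np


def wiener_gain(xi):
    """Wiener gain G = xi / (1 + xi) for a priori SNR xi on a linear scale.

    Assumption: used here only as a simple example of an MMSE-type gain.
    """
    xi = np.asarray(xi, dtype=float)
    return xi / (1.0 + xi)


def enhance_magnitude(noisy_mag, xi_db):
    """Scale noisy spectral magnitudes by the gain for an a priori SNR in dB."""
    # Convert the a priori SNR from decibels to a linear power ratio.
    xi = 10.0 ** (np.asarray(xi_db, dtype=float) / 10.0)
    return wiener_gain(xi) * np.asarray(noisy_mag, dtype=float)
```

Because the gain applied to each time-frequency bin depends only on the a priori SNR for that bin, any error in the SNR estimate carries over directly into the enhanced spectrum.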