An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition

We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Moreover, it is also very computationally efficient and suited to...

Bibliographic Details
Main Authors: Raj, Bhiksha, Turicchia, Lorenzo, Schmidt-Nielsen, Bent, Sarpeshkar, Rahul
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: English
Published: Hindawi Publishing Corporation 2011
Online Access: http://hdl.handle.net/1721.1/67033
https://orcid.org/0000-0003-0384-3786
Description: We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. It is also computationally efficient and well suited to digital implementation because it uses the FFT. In an automotive digit recognition task on the CU-Move database, recorded in real environmental noise, the algorithm improves the relative word error rate by 12.5% at -5 dB signal-to-noise ratio (SNR) and by 6.2% across all SNRs (-5 dB to +5 dB). In the Aurora-2 database, recorded with artificially added noise in several environments, the algorithm improves the relative word error rate in almost all situations.
Record ID: mit-1721.1/67033
Other Authors: Massachusetts Institute of Technology. Research Laboratory of Electronics
Dates: 2022-09-28T09:24:39Z; 2011-11-16T13:45:46Z; 2007-06; 2006-11; 2011-09-23T17:09:42Z
Type: Article http://purl.org/eprint/type/JournalArticle
ISSN: 1687-4714, 1687-4722
Citation: EURASIP Journal on Audio, Speech, and Music Processing. 2007 Jun 26;2007(1):065420
DOI: http://dx.doi.org/10.1155/2007/65420
Rights: Creative Commons Attribution http://creativecommons.org/licenses/by/2.0 et al.; licensee BioMed Central Ltd.
File Format: application/pdf