Psychoacoustic model for robust speech recognition


Bibliographic Details
Main Author:	Luo, Xue Wen
Other Authors:	Soon Ing Yann
Format:	Thesis
Language:	English
Published:	2010
Subjects:	DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
Online Access:	https://hdl.handle.net/10356/41749

_version_	1826115543710564352
author	Luo, Xue Wen
author2	Soon Ing Yann
author_facet	Soon Ing Yann Luo, Xue Wen
author_sort	Luo, Xue Wen
collection	NTU
description	This thesis presents a detailed study of psychoacoustic modeling for feature extraction in robust speech recognition. In an automatic speech recognition (ASR) system, feature extraction is critical to the recognizer's performance. The most popular feature vectors for ASR are Mel Frequency Cepstral Coefficients (MFCC). However, it is well known that their performance drops dramatically under noisy conditions. One objective of this thesis is to improve the robustness of a recognizer. Because humans tolerate background noise far better than ASR systems do, psychoacoustic modeling of the human hearing system is investigated and integrated into the speech feature extraction process of a recognizer to increase its robustness.
first_indexed	2024-10-01T03:56:54Z
format	Thesis
id	ntu-10356/41749
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T03:56:54Z
publishDate	2010
record_format	dspace
spelling	ntu-10356/41749 2023-07-04T17:05:46Z Psychoacoustic model for robust speech recognition Luo, Xue Wen Soon Ing Yann School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing MASTER OF ENGINEERING (EEE) 2010-08-06T07:21:23Z 2010-08-06T07:21:23Z 2008 2008 Thesis Luo, X. W. (2008). Psychoacoustic model for robust speech recognition. Master’s thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/41749 10.32657/10356/41749 en 108 p. application/pdf
title	Psychoacoustic model for robust speech recognition
topic	DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
url	https://hdl.handle.net/10356/41749