A lip geometry approach for feature-fusion based audio-visual speech recognition

This paper describes a feature-fusion audio-visual speech recognition (AVSR) system that extracts lip geometry from the mouth region using a combination of a skin color filter, border following and convex hull, and performs classification using a Hidden Markov Model. By defining a small number of highly descri...

Bibliographic Details
Main Authors:	Ibrahim, M. Z., Mulvaney, D. J.
Format:	Conference or Workshop Item
Language:	English
Published:	IEEE 2014
Subjects:	TK Electrical engineering. Electronics. Nuclear engineering
Online Access:	http://umpir.ump.edu.my/id/eprint/29900/1/A%20lip%20geometry%20approach%20for%20feature-fusion%20based%20audio.pdf http://umpir.ump.edu.my/id/eprint/29900/2/A%20lip%20geometry%20approach%20for%20feature-fusion%20based%20audio_FULL.pdf

_version_	1825813597425500160
author	Ibrahim, M. Z.; Mulvaney, D. J.
author_facet	Ibrahim, M. Z.; Mulvaney, D. J.
author_sort	Ibrahim, M. Z.
collection	UMP
description	This paper describes a feature-fusion audio-visual speech recognition (AVSR) system that extracts lip geometry from the mouth region using a combination of a skin color filter, border following and convex hull, and performs classification using a Hidden Markov Model. By defining a small number of highly descriptive geometrical features relevant to the recognition task, the approach avoids the poor scalability (termed the 'curse of dimensionality') that is often associated with feature-fusion AVSR methods. The paper describes comparisons of the new approach with conventional appearance-based methods, namely the discrete cosine transform and the principal component analysis techniques, when operating under simulated ambient noise conditions that affect the spoken phrases. The experimental results demonstrate that, in the presence of audio noise, the geometrical method significantly improves speech recognition accuracy compared with appearance-based approaches, despite the new method requiring significantly fewer features.
first_indexed	2024-03-06T12:46:23Z
format	Conference or Workshop Item
id	UMPir29900
institution	Universiti Malaysia Pahang
language	English
last_indexed	2024-03-06T12:46:23Z
publishDate	2014
publisher	IEEE
record_format	dspace
spelling	UMPir29900 2022-12-28T04:07:45Z http://umpir.ump.edu.my/id/eprint/29900/ A lip geometry approach for feature-fusion based audio-visual speech recognition Ibrahim, M. Z.; Mulvaney, D. J. TK Electrical engineering. Electronics. Nuclear engineering This paper describes a feature-fusion audio-visual speech recognition (AVSR) system that extracts lip geometry from the mouth region using a combination of a skin color filter, border following and convex hull, and performs classification using a Hidden Markov Model. By defining a small number of highly descriptive geometrical features relevant to the recognition task, the approach avoids the poor scalability (termed the 'curse of dimensionality') that is often associated with feature-fusion AVSR methods. The paper describes comparisons of the new approach with conventional appearance-based methods, namely the discrete cosine transform and the principal component analysis techniques, when operating under simulated ambient noise conditions that affect the spoken phrases. The experimental results demonstrate that, in the presence of audio noise, the geometrical method significantly improves speech recognition accuracy compared with appearance-based approaches, despite the new method requiring significantly fewer features. IEEE 2014 Conference or Workshop Item PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/29900/1/A%20lip%20geometry%20approach%20for%20feature-fusion%20based%20audio.pdf pdf en http://umpir.ump.edu.my/id/eprint/29900/2/A%20lip%20geometry%20approach%20for%20feature-fusion%20based%20audio_FULL.pdf Ibrahim, M. Z. and Mulvaney, D. J. (2014) A lip geometry approach for feature-fusion based audio-visual speech recognition. In: 6th International Symposium on Communications, Control and Signal Processing, ISCCSP 2014, 21-23 May 2014, Athens, Greece. pp. 644-647. (6877957). ISBN 9781479928903 (Published) https://doi.org/10.1109/ISCCSP.2014.6877957
spellingShingle	TK Electrical engineering. Electronics. Nuclear engineering Ibrahim, M. Z.; Mulvaney, D. J. A lip geometry approach for feature-fusion based audio-visual speech recognition
title	A lip geometry approach for feature-fusion based audio-visual speech recognition
title_full	A lip geometry approach for feature-fusion based audio-visual speech recognition
title_fullStr	A lip geometry approach for feature-fusion based audio-visual speech recognition
title_full_unstemmed	A lip geometry approach for feature-fusion based audio-visual speech recognition
title_short	A lip geometry approach for feature-fusion based audio-visual speech recognition
title_sort	lip geometry approach for feature fusion based audio visual speech recognition
topic	TK Electrical engineering. Electronics. Nuclear engineering
url	http://umpir.ump.edu.my/id/eprint/29900/1/A%20lip%20geometry%20approach%20for%20feature-fusion%20based%20audio.pdf http://umpir.ump.edu.my/id/eprint/29900/2/A%20lip%20geometry%20approach%20for%20feature-fusion%20based%20audio_FULL.pdf
work_keys_str_mv	AT mzibrahim alipgeometryapproachforfeaturefusionbasedaudiovisualspeechrecognition AT mulvaneydj alipgeometryapproachforfeaturefusionbasedaudiovisualspeechrecognition AT mzibrahim lipgeometryapproachforfeaturefusionbasedaudiovisualspeechrecognition AT mulvaneydj lipgeometryapproachforfeaturefusionbasedaudiovisualspeechrecognition

Similar Items