Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System

Recently, automatic lip reading (ALR) has attracted significant interest among researchers because of its adoption in many applications. One such application is speech recognition in noisy environments, where visual cues carry information that complements the audio signal, as well...

Bibliographic Details
Main Authors:	Mahmuod Mahmmed, Thamir Saeed, Wissam Ali
Format:	Article
Language:	English
Published:	University of Technology - Iraq 2018-02-01
Series:	Engineering and Technology Journal
Subjects:	visual speech; feature extraction; av letters recognition; classification
Online Access:	https://etj.uotechnology.edu.iq/article_175018_5804480a9db8ab0b4f41f51b3fe937bd.pdf

_version_	1797326046271373312
author	Mahmuod Mahmmed; Thamir Saeed; Wissam Ali
author_facet	Mahmuod Mahmmed; Thamir Saeed; Wissam Ali
author_sort	Mahmuod Mahmmed
collection	DOAJ
description	Recently, automatic lip reading (ALR) has attracted significant interest among researchers because of its adoption in many applications. One such application is speech recognition in noisy environments, where visual cues carry information that complements the audio signal, in the same way that a person merges audio-visual stimuli to identify an utterance. The unsolved part of this problem is classifying the utterance using only visual cues, without the acoustic signal of the talker's speech. Given a set of frames from a recorded video of a person uttering a word, a robust image processing technique is used to isolate the lip region; suitable features that represent the variation of the mouth shape during speech are then extracted. These features are used by the classification stage to identify the uttered word. This paper addresses the problem by introducing a new segmentation technique to isolate the lip region, together with a set of visual features based on the extracted lip boundary, which are able to perform lip reading with significant results. A special laboratory was designed to collect utterances of the twenty-six English letters from multiple speakers, and the resulting corpus (UOTEletters) is adopted in this paper. Moreover, two types of classifier, Numeral Virtual Generalization (NVG) RAM and K-nearest neighbor (KNN), were adopted to identify the talker's utterance. The recognition performance for the input visual utterance is 94.679% when using NVG RAM, which is utilized for the first time in this work, and 92.628% when using KNN.
first_indexed	2024-03-08T06:17:57Z
format	Article
id	doaj.art-e7672fcfed5542bcb0d183b00db3a2b0
institution	Directory of Open Access Journals
issn	1681-6900 2412-0758
language	English
last_indexed	2024-03-08T06:17:57Z
publishDate	2018-02-01
publisher	University of Technology - Iraq
record_format	Article
series	Engineering and Technology Journal
spelling	doaj.art-e7672fcfed5542bcb0d183b00db3a2b0 | 2024-02-04T17:15:41Z | eng | University of Technology - Iraq | Engineering and Technology Journal | ISSN 1681-6900, 2412-0758 | 2018-02-01 | vol. 36, no. 2A, pp. 136-145 | DOI 10.30684/etj.36.2A.4 | article 175018 | Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System | Mahmuod Mahmmed (Dept. of Electrical Engineering, University of Technology, Baghdad, Iraq); Thamir Saeed (Dept. of Electrical Engineering, University of Technology, Baghdad, Iraq); Wissam Ali | https://etj.uotechnology.edu.iq/article_175018_5804480a9db8ab0b4f41f51b3fe937bd.pdf | visual speech; feature extraction; av letters recognition; classification
spellingShingle	Mahmuod Mahmmed Thamir Saeed Wissam Ali Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System Engineering and Technology Journal visual speech feature extraction av letters recognition classification
title	Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System
title_full	Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System
title_fullStr	Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System
title_full_unstemmed	Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System
title_short	Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System
title_sort	robust visual lips feature extraction method for improved visual speech recognition system
topic	visual speech; feature extraction; av letters recognition; classification
url	https://etj.uotechnology.edu.iq/article_175018_5804480a9db8ab0b4f41f51b3fe937bd.pdf
work_keys_str_mv	AT mahmuodmahmmed robustvisuallipsfeatureextractionmethodforimprovedvisualspeechrecognitionsystem AT thamirsaeed robustvisuallipsfeatureextractionmethodforimprovedvisualspeechrecognitionsystem AT wissamali robustvisuallipsfeatureextractionmethodforimprovedvisualspeechrecognitionsystem
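
The last stage named in the description, identifying an utterance from lip-shape features with a K-nearest-neighbor classifier, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the per-frame features (mouth width and height), the frame resampling, the function names and the toy measurements are hypothetical and are not taken from the paper or from the UOTEletters corpus. The record does not describe the paper's segmentation step or its NVG RAM classifier in enough detail to sketch them.

```python
import math
from collections import Counter


def resample(frames, n):
    """Pick n frames at evenly spaced positions so every utterance has equal length."""
    if n == 1:
        return [frames[0]]
    last = len(frames) - 1
    return [frames[round(i * last / (n - 1))] for i in range(n)]


def utterance_vector(frames, n=4):
    """Flatten per-frame (width, height) lip measurements into one feature vector."""
    return [value for frame in resample(frames, n) for value in frame]


def knn_predict(train, query, k=3):
    """train is a list of (vector, label); return the majority label of the k nearest vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (width, height) sequences in pixels, one tuple per video frame.
recordings = [
    ([(40, 10), (42, 25), (43, 30), (41, 12)], "A"),
    ([(39, 11), (41, 24), (42, 28), (42, 27), (40, 13)], "A"),
    ([(30, 10), (24, 20), (22, 22), (28, 11)], "O"),
    ([(31, 9), (25, 19), (23, 21), (23, 22), (29, 10)], "O"),
]
train = [(utterance_vector(frames), label) for frames, label in recordings]

# An unseen utterance is mapped to the same fixed-length vector and classified.
query = utterance_vector([(40, 10), (41, 26), (42, 29), (40, 12)])
predicted = knn_predict(train, query)
print(predicted)
```

Resampling to a fixed number of frames is one simple way to compare utterances of different durations with a plain Euclidean distance; other alignment schemes would work equally well in this sketch.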

Similar Items