Lipreading Architecture Based on Multiple Convolutional Neural Networks for Sentence-Level Visual Speech Recognition

In visual speech recognition (VSR), speech is transcribed using only visual information to interpret tongue and teeth movements. Recently, deep learning has shown outstanding performance in VSR, with accuracy exceeding that of lipreaders on benchmark datasets. However, several problems still exist w...

Bibliographic Details
Main Authors:	Sanghun Jeon, Ahmed Elsharkawy, Mun Sang Kim
Format:	Article
Language:	English
Published:	MDPI AG 2021-12-01
Series:	Sensors
Subjects:	3D densely connected CNN; 3D multi-layer feature fusion CNN; convolutional neural network; deep learning; lipreading; speech recognition
Online Access:	https://www.mdpi.com/1424-8220/22/1/72
