Sub-word level lip reading with visual attention

The goal of this paper is to learn strong lip reading models that can recognise speech in silent videos. Most prior works deal with the open-set visual speech recognition problem by adapting existing automatic speech recognition techniques on top of trivially pooled visual features. Instead, in this...

Bibliographic Details
Main Authors: Prajwal, KR; Afouras, T; Zisserman, A
Format: Conference item
Language: English
Published: IEEE 2022