Sub-word level lip reading with visual attention
The goal of this paper is to learn strong lip reading models that can recognise speech in silent videos. Most prior works deal with the open-set visual speech recognition problem by adapting existing automatic speech recognition techniques on top of trivially pooled visual features. Instead, in this...
Main Authors: | , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | IEEE, 2022 |