Sub-word level lip reading with visual attention
The goal of this paper is to learn strong lip reading models that can recognise speech in silent videos. Most prior works deal with the open-set visual speech recognition problem by adapting existing automatic speech recognition techniques on top of trivially pooled visual features. Instead, in this...
Main authors: Prajwal, KR; Afouras, T; Zisserman, A
Format: Conference item
Language: English
Published: IEEE, 2022
Similar items
- Speech recognition models are strong lip-readers
  by: Prajwal, KR, et al.
  Published: (2024)
- Visual keyword spotting with attention
  by: Prajwal, KR, et al.
  Published: (2022)
- Deep lip reading: a comparison of models and an online application
  by: Afouras, T, et al.
  Published: (2018)
- ASR is all you need: cross-modal distillation for lip reading
  by: Afouras, T, et al.
  Published: (2020)
- Learning to lip read words by watching videos
  by: Chung, J, et al.
  Published: (2018)