Sub-word level lip reading with visual attention
The goal of this paper is to learn strong lip reading models that can recognise speech in silent videos. Most prior works deal with the open-set visual speech recognition problem by adapting existing automatic speech recognition techniques on top of trivially pooled visual features. Instead, in this...
Main authors: Prajwal, KR; Afouras, T; Zisserman, A
Format: Conference item
Language: English
Published: IEEE, 2022
Similar items
- Speech recognition models are strong lip-readers
  by: Prajwal, KR, et al.
  Published: (2024)
- Visual keyword spotting with attention
  by: Prajwal, KR, et al.
  Published: (2022)
- Deep lip reading: a comparison of models and an online application
  by: Afouras, T, et al.
  Published: (2018)
- ASR is all you need: cross-modal distillation for lip reading
  by: Afouras, T, et al.
  Published: (2020)
- Learning to lip read words by watching videos
  by: Chung, J, et al.
  Published: (2018)