My lips are concealed: audio-visual speech enhancement through obstructions
Our objective is an audio-visual model for separating a single speaker from a mixture of sounds such as other speakers and background noise. Moreover, we wish to hear the speaker even when the visual cues are temporarily absent due to occlusion. To this end we introduce a deep audio-visual speech e...
Main Authors: | Afouras, T, Chung, JS, Zisserman, A |
---|---|
Format: | Conference item |
Published: | ISCA 2019 |
Similar Items
- The conversation: deep audio-visual speech enhancement
  by: Afouras, T, et al.
  Published: (2018)
- Deep audio-visual speech recognition
  by: Afouras, T, et al.
  Published: (2018)
- Speech recognition models are strong lip-readers
  by: Prajwal, KR, et al.
  Published: (2024)
- Now you're speaking my language: visual language identification
  by: Afouras, T, et al.
  Published: (2020)
- Self-supervised learning of audio-visual objects from video
  by: Afouras, T, et al.
  Published: (2020)