My lips are concealed: audio-visual speech enhancement through obstructions

Our objective is an audio-visual model for separating a single speaker from a mixture of sounds such as other speakers and background noise. Moreover, we wish to hear the speaker even when the visual cues are temporarily absent due to occlusion. To this end we introduce a deep audio-visual speech e...

Bibliographic Details
Main Authors:	Afouras, T, Chung, JS, Zisserman, A
Format:	Conference item
Published:	ISCA 2019

Similar Items

The conversation: deep audio-visual speech enhancement
by: Afouras, T, et al.
Published: (2018)

Deep audio-visual speech recognition
by: Afouras, T, et al.
Published: (2018)

Speech recognition models are strong lip-readers
by: Prajwal, KR, et al.
Published: (2024)

Now you're speaking my language: visual language identification
by: Afouras, T, et al.
Published: (2020)

Self-supervised learning of audio-visual objects from video
by: Afouras, T, et al.
Published: (2020)

ASR is all you need: cross-modal distillation for lip reading
by: Afouras, T, et al.
Published: (2020)

Sub-word level lip reading with visual attention
by: Prajwal, KR, et al.
Published: (2022)

A novel lip geometry approach for audio-visual speech recognition
by: Mohd Zamri, Ibrahim
Published: (2014)

Deep lip reading: a comparison of models and an online application
by: Afouras, T, et al.
Published: (2018)

Audio-visual deep learning
by: Afouras, T, et al.
Published: (2021)

A lip geometry approach for feature-fusion based audio-visual speech recognition
by: M. Z., Ibrahim, et al.
Published: (2014)

Audio-visual synchronisation in the wild
by: Chen, H, et al.
Published: (2021)

Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
by: M. Z., Ibrahim, et al.
Published: (2015)

Seeing wake words: Audio-visual keyword spotting
by: Momeni, L, et al.
Published: (2020)

Perceptual watermarking and data concealment in audio signals
by: Tio, Cedric Meng Meng.
Published: (2008)

Perceptual watermarking and data concealment in audio signals
by: McLoughlin, Ian.
Published: (2008)

Lip reading in the wild
by: Chung, J, et al.
Published: (2017)

Lip reading in profile
by: Chung, J, et al.
Published: (2017)

3D lips development and measurement for visual speech synthesis
by: Salleh, Siti Salwa, et al.
Published: (2009)

Hardware architecture for perceptual watermarking and data concealment in audio signals
by: Robertus Wahendro Ali.
Published: (2008)

Visual perception. Contours revealed by concealment.
by: Braddick, O
Published: (1988)

A review of audio-visual speech recognition
by: Thum, Wei Seong, et al.
Published: (2018)

Lip Reading Sentences in the Wild
by: Chung, J, et al.
Published: (2017)

Learning to lip read words by watching videos
by: Chung, J, et al.
Published: (2018)

Out of time: automated lip sync in the wild
by: Chung, J, et al.
Published: (2017)

Statistical modeling and analysis of audio-visual association in speech
by: Siracusa, Michael Richard, 1980-
Published: (2006)

Self-Supervised Audio-Visual Speech Diarization and Recognition
by: Wongprommoon, Arun
Published: (2024)

Cortical operational synchrony during audio-visual speech integration.
by: Fingelkurts, A, et al.
Published: (2003)

You said that?: Synthesising talking faces from audio
by: Jamaludin, A, et al.
Published: (2019)

Reading to listen at the cocktail party: multi-modal speech separation
by: Rahimi, A, et al.
Published: (2022)

Character-aware audio-visual subtitling in context
by: Huh, J, et al.
Published: (2024)

Result comparison of model validation techniques on audio-visual speech recognition
by: Thum, Wei Seong, et al.
Published: (2017)

Development of audio-visual speech recognition using deep-learning technique
by: How, Chun Kit, et al.
Published: (2022)

FMRI studies of effects of hearing status on audio-visual speech perception
by: Yoo, Julie J
Published: (2008)

WhisperX: time-accurate speech transcription of long-form audio
by: Bain, M, et al.
Published: (2023)

On the interpretation of concealed questions
by: Nathan, Lance Edward
Published: (2007)

Infrastructure development for integration of lip reading into the SUMMIT Speech Recognizer
by: La, Chia-Hao, 1980-
Published: (2006)

Digital audio/speech forensics
by: Tok, De Fang.
Published: (2010)

Video and audio analysis for the detection of Obstructive Sleep Apnoea
by: Gederi, E
Published: (2017)

Audio-visual modelling in a clinical setting
by: Jiao, J, et al.
Published: (2024)