My lips are concealed: audio-visual speech enhancement through obstructions
Our objective is an audio-visual model for separating a single speaker from a mixture of sounds such as other speakers and background noise. Moreover, we wish to hear the speaker even when the visual cues are temporarily absent due to occlusion. To this end we introduce a deep audio-visual speech e...
Main Authors: | , , |
---|---|
Format: | Conference item |
Published: | ISCA 2019 |