Look, listen and recognise: character-aware audio-visual subtitling

The goal of this paper is automatic character-aware subtitle generation. Given a video and a minimal amount of metadata, we propose an audio-visual method that generates a full transcript of the dialogue, with precise speech timestamps, and the character speaking identified. The key idea is to first...

Bibliographic Details
Main Authors: Korbar, B; Huh, J; Zisserman, A
Format: Conference item
Language: English
Published: IEEE 2024