AutoAD-Zero: a training-free framework for zero-shot audio description

Our objective is to generate Audio Descriptions (ADs) for both movies and TV series in a training-free manner. We use the power of off-the-shelf Visual-Language Models (VLMs) and Large Language Models (LLMs), and develop visual and text prompting strategies for this task. Our contributions are three...

Bibliographic Details
Main Authors: Xie, J; Han, T; Bain, M; Nagrani, A; Varol, G; Xie, W; Zisserman, A
Format: Conference item
Language: English
Published: Springer 2024