AutoAD-Zero: a training-free framework for zero-shot audio description

Our objective is to generate Audio Descriptions (ADs) for both movies and TV series in a training-free manner. We use the power of off-the-shelf Visual-Language Models (VLMs) and Large Language Models (LLMs), and develop visual and text prompting strategies for this task. Our contributions are three...

Bibliographic Details
Main Authors:	Xie, J; Han, T; Bain, M; Nagrani, A; Varol, G; Xie, W; Zisserman, A
Format:	Conference item
Language:	English
Published:	Springer 2024

Similar Items