AutoAD-Zero: a training-free framework for zero-shot audio description

Our objective is to generate Audio Descriptions (ADs) for both movies and TV series in a training-free manner. We use the power of off-the-shelf Visual-Language Models (VLMs) and Large Language Models (LLMs), and develop visual and text prompting strategies for this task. Our contributions are three...


Bibliographic Details
Main Authors:	Xie, J; Han, T; Bain, M; Nagrani, A; Varol, G; Xie, W; Zisserman, A
Format:	Conference item
Language:	English
Published:	Springer 2024
