An image is worth 1000 lies: adversarial transferability across prompts on vision-language models

Unlike traditional task-specific vision models, recent large vision-language models (VLMs) can readily adapt to different vision tasks simply by using different textual instructions, i.e., prompts. However, a well-known concern about traditional task-specific vision models is that they can be misled by imperceptible...

Bibliographic Details
Main Authors: Luo, H; Gu, J; Liu, F; Torr, P
Format: Conference item
Language: English
Published: OpenReview 2024