An image is worth 1000 lies: adversarial transferability across prompts on vision-language models
Unlike traditional task-specific vision models, recent large vision-language models (VLMs) can readily adapt to different vision tasks simply by using different textual instructions, i.e., prompts. However, a well-known concern about traditional task-specific vision models is that they can be misled by imperceptible...
Main Authors: | , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | OpenReview 2024 |