As firm as their foundations: creating transferable adversarial examples across downstream tasks with CLIP
Foundation models pre-trained on web-scale vision-language data, such as CLIP, are widely used as cornerstones of powerful machine learning systems. While pre-training offers clear advantages for downstream learning, it also endows downstream models with shared adversarial vulnerabilities that can b...
Main Authors: |
Format: Conference item
Language: English
Published: British Machine Vision Association, 2024