As firm as their foundations: can open-sourced foundation models be used to create adversarial examples for downstream tasks?
Foundation models pre-trained on web-scale vision-language data, such as CLIP, are widely used as cornerstones of powerful machine learning systems. While pre-training offers clear advantages for downstream learning, it also endows downstream models with shared adversarial vulnerabilities that can be easily identified through the open-sourced foundation model. In this work, we expose such vulnerabilities among CLIP's downstream models and show that foundation models can serve as a basis for attacking their downstream systems. In particular, we propose a simple yet alarmingly effective adversarial attack strategy termed Patch Representation Misalignment (PRM). Solely based on open-sourced CLIP vision encoders, this method can produce highly effective adversaries that simultaneously fool more than 20 downstream models spanning 4 common vision-language tasks (semantic segmentation, object detection, image captioning and visual question-answering). Our findings highlight the concerning safety risks introduced by the extensive usage of publicly available foundation models in the development of downstream systems, calling for extra caution in these scenarios.
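The abstract does not spell out PRM's loss or optimisation loop, so the following is a minimal, hypothetical sketch of what a PRM-style attack could look like: an L∞ PGD loop that pushes the patch-token features produced by a CLIP vision encoder away from those of the clean image. The function names (`prm_attack`, `encode_patches`), the cosine-similarity objective, and the PGD hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a PRM-style attack (not the paper's code).
# Assumption: the attack perturbs the input so that per-patch features
# from an open-source CLIP vision encoder are driven away from the
# clean image's patch features, under an L_inf budget.
import torch
import torch.nn.functional as F


def prm_attack(encode_patches, x, eps=8 / 255, alpha=2 / 255, steps=40):
    """Perturb `x` so its patch representations misalign with the clean ones.

    encode_patches: callable mapping images (B, C, H, W) to patch
        features (B, N, D), e.g. the pre-pool token outputs of a CLIP ViT.
    """
    with torch.no_grad():
        clean = F.normalize(encode_patches(x), dim=-1)  # reference patch features

    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv = F.normalize(encode_patches(x + delta), dim=-1)
        # Mean per-patch cosine similarity to the clean features;
        # descending on this maximises misalignment.
        loss = (adv * clean).sum(-1).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()        # gradient descent on similarity
            delta.clamp_(-eps, eps)                   # stay within the L_inf budget
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep the image in [0, 1]
        delta.grad = None
    return (x + delta).detach()


if __name__ == "__main__":
    # Stand-in patch tokenizer so the sketch runs end to end; in practice
    # one would plug in a real CLIP ViT's patch-token extractor here.
    patchify = torch.nn.Conv2d(3, 64, kernel_size=32, stride=32)
    encode = lambda im: patchify(im).flatten(2).transpose(1, 2)  # (B, N, 64)
    x = torch.rand(1, 3, 224, 224)
    x_adv = prm_attack(encode, x)
    print((x_adv - x).abs().max())  # bounded by eps = 8/255
```

Any encoder exposing pre-pooling patch tokens could be dropped in for `encode_patches`; the transfer of such adversaries to 20+ downstream models is the paper's empirical claim, not something this sketch demonstrates.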
Main Authors: | Hu, A; Gu, J; Pinto, F; Kamnitsas, K; Torr, P |
---|---|
Format: | Conference item |
Language: | English |
Published: | 2024 |
Institution: | University of Oxford |
Collection: | OXFORD |
Record ID: | oxford-uuid:087a52c9-6f48-4b09-a125-dcdd773da976 |