Purify unlearnable examples via rate-constrained variational autoencoders
Main Authors: | Yu, Yi; Wang, Yufei; Xia, Song; Yang, Wenhan; Lu, Shijian; Tan, Yap Peng; Kot, Alex Chichung |
---|---|
Other Authors: | Interdisciplinary Graduate School (IGS) |
Format: | Conference Paper |
Language: | English |
Published: | 2024 |
Subjects: | Computer and Information Science; Unlearnable examples; Defense; Poisoning attacks; Indiscriminate poisoning attacks; Purification; Unlearnable datasets; Availability poisoning attacks |
Online Access: | https://hdl.handle.net/10356/178531 https://proceedings.mlr.press/v235/ https://icml.cc/ |
_version_ | 1826114113851359232 |
author | Yu, Yi Wang, Yufei Xia, Song Yang, Wenhan Lu, Shijian Tan, Yap Peng Kot, Alex Chichung |
author2 | Interdisciplinary Graduate School (IGS) |
author_facet | Interdisciplinary Graduate School (IGS) Yu, Yi Wang, Yufei Xia, Song Yang, Wenhan Lu, Shijian Tan, Yap Peng Kot, Alex Chichung |
author_sort | Yu, Yi |
collection | NTU |
description | Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Defenses against these poisoning attacks can be categorized by whether specific interventions are adopted during training. The first approach is training-time defense, such as adversarial training, which can mitigate poisoning effects but is computationally intensive. The other approach is pre-training purification, e.g., image shortcut squeezing, which consists of several simple compressions but often encounters challenges in dealing with various UEs. Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method. Firstly, we uncover that rate-constrained variational autoencoders (VAEs) demonstrate a clear tendency to suppress the perturbations in UEs. We subsequently conduct a theoretical analysis of this phenomenon. Building upon these insights, we introduce a disentangle variational autoencoder (D-VAE), capable of disentangling the perturbations with learnable class-wise embeddings. Based on this network, a two-stage purification approach is naturally developed: the first stage roughly eliminates perturbations, while the second stage produces refined, poison-free results, ensuring effectiveness and robustness across various scenarios. Extensive experiments demonstrate the remarkable performance of our method across CIFAR-10, CIFAR-100, and a 100-class ImageNet-subset. Code is available at https://github.com/yuyi-sd/D-VAE. |
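The abstract's key observation is that a VAE trained under a tight rate (KL) budget tends to drop the small perturbations in UEs, since the encoder must spend its limited rate on dominant image content. The objective such a model minimizes can be sketched as below. This is a minimal numpy illustration, not the paper's implementation; the hinge form of the rate constraint and the names `rate_limit` and `lam` are assumptions for exposition.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)

def rate_constrained_loss(x, x_hat, mu, logvar, rate_limit, lam=10.0):
    """Reconstruction error plus a hinge penalty that caps the KL rate.

    The penalty is zero while KL <= rate_limit, so the encoder is pushed
    to encode dominant image content rather than small perturbations.
    """
    recon = np.sum((x - x_hat) ** 2, axis=-1)         # per-example squared error
    kl = gaussian_kl(mu, logvar)                      # per-example rate (nats)
    penalty = lam * np.maximum(0.0, kl - rate_limit)  # hinge on the rate budget
    return float(np.mean(recon + penalty))
```

With a small `rate_limit`, reconstructions preserve coarse content while fine, high-frequency perturbation patterns are the first information to be discarded, which is the tendency the paper analyzes theoretically.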
first_indexed | 2024-10-01T03:33:52Z |
format | Conference Paper |
id | ntu-10356/178531 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T03:33:52Z |
publishDate | 2024 |
record_format | dspace |
spelling | ntu-10356/178531 2024-08-04T15:36:24Z Purify unlearnable examples via rate-constrained variational autoencoders Yu, Yi Wang, Yufei Xia, Song Yang, Wenhan Lu, Shijian Tan, Yap Peng Kot, Alex Chichung Interdisciplinary Graduate School (IGS) School of Electrical and Electronic Engineering School of Computer Science and Engineering 41st International Conference on Machine Learning (ICML 2024) Rapid-Rich Object Search (ROSE) Lab Computer and Information Science Unlearnable examples Defense Poisoning attacks Indiscriminate poisoning attacks Purification Unlearnable datasets Availability poisoning attacks Availability Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Defenses against these poisoning attacks can be categorized by whether specific interventions are adopted during training. The first approach is training-time defense, such as adversarial training, which can mitigate poisoning effects but is computationally intensive. The other approach is pre-training purification, e.g., image shortcut squeezing, which consists of several simple compressions but often encounters challenges in dealing with various UEs. Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method. Firstly, we uncover that rate-constrained variational autoencoders (VAEs) demonstrate a clear tendency to suppress the perturbations in UEs. We subsequently conduct a theoretical analysis of this phenomenon. Building upon these insights, we introduce a disentangle variational autoencoder (D-VAE), capable of disentangling the perturbations with learnable class-wise embeddings. Based on this network, a two-stage purification approach is naturally developed: the first stage roughly eliminates perturbations, while the second stage produces refined, poison-free results, ensuring effectiveness and robustness across various scenarios. 
Extensive experiments demonstrate the remarkable performance of our method across CIFAR-10, CIFAR-100, and a 100-class ImageNet-subset. Code is available at https://github.com/yuyi-sd/D-VAE. Nanyang Technological University Published version This research is supported in part by the NTU-PKU Joint Research Institute and the DSO National Laboratories, Singapore, under the project agreement No. DSOCL22332. 2024-07-30T02:29:16Z 2024-07-30T02:29:16Z 2024 Conference Paper Yu, Y., Wang, Y., Xia, S., Yang, W., Lu, S., Tan, Y. P. & Kot, A. C. (2024). Purify unlearnable examples via rate-constrained variational autoencoders. 41st International Conference on Machine Learning (ICML 2024), PMLR 235, 1-25. https://hdl.handle.net/10356/178531 https://proceedings.mlr.press/v235/ https://icml.cc/ PMLR 235 1 25 en DSOCL22332 © 2024 The Author(s). All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at https://proceedings.mlr.press/v235/. application/pdf |
spellingShingle | Computer and Information Science Unlearnable examples Defense Poisoning attacks Indiscriminate poisoning attacks Purification Unlearnable datasets Availability poisoning attacks Availability Yu, Yi Wang, Yufei Xia, Song Yang, Wenhan Lu, Shijian Tan, Yap Peng Kot, Alex Chichung Purify unlearnable examples via rate-constrained variational autoencoders |
title | Purify unlearnable examples via rate-constrained variational autoencoders |
title_full | Purify unlearnable examples via rate-constrained variational autoencoders |
title_fullStr | Purify unlearnable examples via rate-constrained variational autoencoders |
title_full_unstemmed | Purify unlearnable examples via rate-constrained variational autoencoders |
title_short | Purify unlearnable examples via rate-constrained variational autoencoders |
title_sort | purify unlearnable examples via rate constrained variational autoencoders |
topic | Computer and Information Science Unlearnable examples Defense Poisoning attacks Indiscriminate poisoning attacks Purification Unlearnable datasets Availability poisoning attacks Availability |
url | https://hdl.handle.net/10356/178531 https://proceedings.mlr.press/v235/ https://icml.cc/ |
work_keys_str_mv | AT yuyi purifyunlearnableexamplesviarateconstrainedvariationalautoencoders AT wangyufei purifyunlearnableexamplesviarateconstrainedvariationalautoencoders AT xiasong purifyunlearnableexamplesviarateconstrainedvariationalautoencoders AT yangwenhan purifyunlearnableexamplesviarateconstrainedvariationalautoencoders AT lushijian purifyunlearnableexamplesviarateconstrainedvariationalautoencoders AT tanyappeng purifyunlearnableexamplesviarateconstrainedvariationalautoencoders AT kotalexchichung purifyunlearnableexamplesviarateconstrainedvariationalautoencoders |