Make some noise: reliable and efficient single-step adversarial training
Recently, Wong et al. (2020) showed that adversarial training with single-step FGSM leads to a characteristic failure mode named catastrophic overfitting (CO), in which a model becomes suddenly vulnerable to multi-step attacks. Experimentally they showed that simply adding a random perturbation prio...
Main creators: | , , , , , , |
---|---|
Format: | Conference item |
Language: | English |
Published / Created: | Curran Associates, 2023 |