Make some noise: reliable and efficient single-step adversarial training

Recently, Wong et al. (2020) showed that adversarial training with single-step FGSM leads to a characteristic failure mode named catastrophic overfitting (CO), in which a model becomes suddenly vulnerable to multi-step attacks. Experimentally they showed that simply adding a random perturbation prio...

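The random-perturbation idea referred to in the abstract can be summarised in a few lines. Below is a minimal PyTorch sketch of single-step FGSM with a random initial perturbation (RS-FGSM in the sense of Wong et al., 2020), shown only for illustration and not as the method proposed in this paper; the function name, the step size alpha, and the assumption that inputs lie in [0, 1] are assumptions made for the sketch.

import torch
import torch.nn.functional as F

def rs_fgsm_example(model, x, y, epsilon, alpha):
    # Illustrative sketch (not this paper's algorithm): one FGSM step
    # taken from a random start inside the L-infinity ball of radius epsilon.
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)  # random perturbation prior to FGSM
    delta.requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)               # single forward/backward pass
    grad = torch.autograd.grad(loss, delta)[0]
    delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)  # FGSM step, projected back to the eps-ball
    x_adv = (x + delta).clamp(0.0, 1.0)                       # assumed valid input range [0, 1]
    return x_adv.detach()
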

Bibliographic Details

Main Authors: de Jorge, P, Bibi, A, Volpi, R, Sanyal, A, Torr, PHS, Rogez, G, Dokania, PK
Format: Conference item
Language: English
Published / Created: Curran Associates 2023