Adversarial robustness without perturbations

Models resistant to adversarial perturbations are stable in the neighbourhoods of input images, so that small changes, known as adversarial attacks, cannot dramatically change the prediction. Currently, this stability is obtained with Adversarial Training, which directly teaches models to be r...

Bibliographic Details
Main Author: Rodríguez Muñoz, Adrián
Other Authors: Torralba, Antonio
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/156344