Adversarial Examples in Simpler Settings

In this thesis we explore adversarial examples for simple model families and simple data distributions, focusing in particular on linear and kernel classifiers. On the theoretical front we find evidence that natural accuracy and robust accuracy are more likely than not to be misaligned. We conclude from this that in order to learn a robust classifier, one should explicitly aim for it, either via a good choice of model family or via optimizing explicitly for robust accuracy. On the empirical front we discover that kernel classifiers and neural networks are non-robust in similar ways. This suggests that a better understanding of kernel classifier robustness may help unravel some of the mysteries of adversarial examples.
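The misalignment between natural and robust accuracy mentioned in the abstract is easiest to make concrete for a linear classifier, where the worst-case l-infinity perturbation has a closed form. The sketch below is illustrative only and is not code from the thesis; the Gaussian-mixture data, the weight choice w = mu, and the budget eps are assumptions made purely for the demo.

    # Illustrative sketch (not from the thesis): adversarial examples for a
    # linear classifier f(x) = sign(w @ x + b) under an l-infinity budget eps.
    # For linear models the worst-case attack has a closed form: shifting each
    # coordinate by eps against the label, x_adv = x - eps * y * sign(w),
    # lowers the margin y * (w @ x + b) by exactly eps * ||w||_1.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy Gaussian mixture (an assumption for this demo): class means +/- mu.
    d, n, eps = 50, 1000, 0.1
    mu = rng.normal(size=d) / np.sqrt(d)
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu + rng.normal(size=(n, d))

    # A simple linear classifier; w = mu, b = 0 is likewise just a demo choice.
    w, b = mu, 0.0

    def natural_accuracy(X, y, w, b):
        return np.mean(np.sign(X @ w + b) == y)

    def robust_accuracy(X, y, w, b, eps):
        # Worst-case margin over the l-infinity ball of radius eps.
        margins = y * (X @ w + b) - eps * np.abs(w).sum()
        return np.mean(margins > 0)

    # Closed-form worst-case perturbation within the eps-ball.
    X_adv = X - eps * y[:, None] * np.sign(w)

    print("natural accuracy: ", natural_accuracy(X, y, w, b))
    print("robust accuracy:  ", robust_accuracy(X, y, w, b, eps))
    print("accuracy on X_adv:", natural_accuracy(X_adv, y, w, b))

Running this shows natural accuracy noticeably above the exactly computed robust accuracy, and the closed-form attack achieves exactly that robust accuracy, which is the kind of natural/robust gap the abstract refers to.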

Bibliographic Details
Main Author: Wang, Tony T.
Other Authors: Wornell, Gregory W.
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/139041
Department: Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Degree: M.Eng.
Thesis Date: June 2021
Rights: In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/)