Adversarial Examples in Simpler Settings

In this thesis we explore adversarial examples for simple model families and simple data distributions, focusing in particular on linear and kernel classifiers. On the theoretical front we find evidence that natural accuracy and robust accuracy are more likely than not to be misaligned. We conclude from this that in order to learn a robust classifier, one should explicitly aim for it, either via a good choice of model family or via optimizing explicitly for robust accuracy. On the empirical front we discover that kernel classifiers and neural networks are non-robust in similar ways. This suggests that a better understanding of kernel classifier robustness may help unravel some of the mysteries of adversarial examples.
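The misalignment between natural and robust accuracy mentioned in the abstract is easiest to make concrete for a linear classifier, where the worst-case l-infinity perturbation has a closed form. The sketch below is illustrative only and is not code from the thesis; the Gaussian-mixture data, the weight choice w = mu, and the budget eps are assumptions made purely for the demo.

    # Illustrative sketch (not from the thesis): adversarial examples for a
    # linear classifier f(x) = sign(w @ x + b) under an l-infinity budget eps.
    # For linear models the worst-case attack has a closed form: shifting each
    # coordinate by eps against the label, x_adv = x - eps * y * sign(w),
    # lowers the margin y * (w @ x + b) by exactly eps * ||w||_1.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy Gaussian mixture (an assumption for this demo): class means +/- mu.
    d, n, eps = 50, 1000, 0.1
    mu = rng.normal(size=d) / np.sqrt(d)
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu + rng.normal(size=(n, d))

    # A simple linear classifier; w = mu, b = 0 is likewise just a demo choice.
    w, b = mu, 0.0

    def natural_accuracy(X, y, w, b):
        return np.mean(np.sign(X @ w + b) == y)

    def robust_accuracy(X, y, w, b, eps):
        # Worst-case margin over the l-infinity ball of radius eps.
        margins = y * (X @ w + b) - eps * np.abs(w).sum()
        return np.mean(margins > 0)

    # Closed-form worst-case perturbation within the eps-ball.
    X_adv = X - eps * y[:, None] * np.sign(w)

    print("natural accuracy: ", natural_accuracy(X, y, w, b))
    print("robust accuracy:  ", robust_accuracy(X, y, w, b, eps))
    print("accuracy on X_adv:", natural_accuracy(X_adv, y, w, b))

Running this shows natural accuracy noticeably above the exactly computed robust accuracy, and the closed-form attack achieves exactly that robust accuracy, which is the kind of natural/robust gap the abstract refers to.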

Bibliographic Details
Main Author: Wang, Tony T.
Other Authors: Wornell, Gregory W.
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/139041
Department: Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Degree: M.Eng.
Thesis Date: June 2021
Rights: In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/)