Adversarial Examples in Simpler Settings
In this thesis we explore adversarial examples for simple model families and simple data distributions, focusing in particular on linear and kernel classifiers. On the theoretical front we find evidence that natural accuracy and robust accuracy are more likely than not to be misaligned. We conclude from this that in order to learn a robust classifier, one should explicitly aim for it, either via a good choice of model family or via optimizing explicitly for robust accuracy. On the empirical front we discover that kernel classifiers and neural networks are non-robust in similar ways. This suggests that a better understanding of kernel classifier robustness may help unravel some of the mysteries of adversarial examples.
Main Author: | Wang, Tony T. |
---|---|
Other Authors: | Wornell, Gregory W.; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Thesis (M.Eng.) |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/139041 |
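The abstract centers on adversarial examples for linear classifiers. As a minimal illustration of that setting (a sketch with made-up weights and inputs, not code or results from the thesis itself): for a linear classifier f(x) = sign(w·x + b), the nearest adversarial example in L2 distance lies along w at distance |w·x + b| / ‖w‖, so a small step in that direction flips the prediction.

```python
import numpy as np

# Linear classifier f(x) = sign(w @ x + b).
# The closest L2 adversarial example has a closed form: move x along
# -w * sign(margin) by just over |w @ x + b| / ||w||.
# The weights and input below are illustrative, not from the thesis.

w = np.array([2.0, -1.0, 0.5])
b = 0.3
x = np.array([1.0, 0.4, -0.2])   # a point the classifier labels +1

margin = w @ x + b               # signed distance to boundary, times ||w||
# Minimal perturbation crossing the boundary, with a 1% overshoot so the
# flipped label is strict rather than exactly on the boundary:
delta = -(margin / np.dot(w, w)) * w * 1.01
x_adv = x + delta

print("original prediction:   ", np.sign(w @ x + b))      # +1
print("adversarial prediction:", np.sign(w @ x_adv + b))   # -1
print("perturbation L2 norm:  ", np.linalg.norm(delta))
```

The norm of `delta` shrinks as |margin| / ‖w‖ shrinks, which is one way to see why linear models with small margins are easy targets, a setting the thesis studies in detail.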