Learning Through the Lens of Robustness

Despite their impressive performance on large-scale benchmarks, machine learning systems turn out to be quite brittle outside of the exact setting in which they were developed. How can we build ML models that are robust and reliable enough for real-world deployment?

To answer this question, we first focus on training models that are robust to small, worst-case perturbations of their input. Specifically, we consider the framework of robust optimization and study how these tools can be leveraged in the context of modern ML models. As it turns out, this approach leads us to the first deep learning models that are robust to a wide range of (small) perturbations on realistic datasets.

Next, we explore how such a paradigm of adversarially robust learning differs from the standard learning setting. As we will see, robust learning may require training a model that relies on a fundamentally different set of input features. In fact, this requirement can give rise to a trade-off between robustness and accuracy. At the same time, the features that robust models rely on turn out to be more aligned with human perception and, in turn, make these models also useful outside the context of reliability.

Finally, we move beyond the worst-case perturbation setting and investigate other robustness challenges in deploying models in the wild. On one hand, we develop general methodologies for creating benchmarks that gauge model robustness along a variety of axes, such as subpopulation shift and concept transformations. On the other hand, we explore ways to improve the reliability of our models during deployment. To this end, we study how we can bias the features that a model learns towards features that generalize to new environments. Moreover, we develop a methodology that allows us to directly rewrite the prediction rules of a model with virtually no additional data collection.
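
For concreteness, the worst-case perturbation setting the abstract describes is usually formalized as a min-max (robust optimization) objective:

    min_θ  E_{(x,y)∼D} [ max_{‖δ‖ ≤ ε} L(θ, x + δ, y) ]

where θ are the model parameters, L is the training loss, and the inner maximization searches for the worst perturbation δ within an ε-ball around each input x. The sketch below shows projected gradient descent (PGD) adversarial training, a standard way to approximate this objective; the PyTorch code, the l-infinity threat model, and the hyperparameter values are illustrative assumptions rather than details drawn from the thesis itself.

import torch
import torch.nn.functional as F

def pgd_perturbation(model, x, y, eps, step_size, steps):
    # Inner maximization: ascend the loss by projected gradient descent,
    # staying inside the l-infinity ball of radius eps around x.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Signed-gradient ascent step, then projection back onto the eps-ball.
        # (Clamping x + delta to the valid input range is omitted for brevity.)
        delta = (delta + step_size * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return delta.detach()

def adversarial_training_step(model, optimizer, x, y,
                              eps=8 / 255, step_size=2 / 255, steps=10):
    # Outer minimization: update the model on the worst-case perturbed batch.
    delta = pgd_perturbation(model, x, y, eps, step_size, steps)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    optimizer.step()
    return loss.item()

Each training step thus solves an approximate inner maximization before taking a single outer gradient step, which is what makes this procedure considerably more expensive than standard training.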

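The subpopulation-shift benchmarks mentioned in the final paragraph can likewise be illustrated with a toy construction: group fine-grained labels into superclasses, then let training and evaluation draw on disjoint subpopulations of each superclass. The hierarchy below is entirely hypothetical; the actual methodology builds such splits from large-scale dataset hierarchies.

# Hypothetical superclass -> subpopulation hierarchy (illustration only).
superclasses = {
    "dog": ["beagle", "husky", "poodle", "terrier"],
    "cat": ["siamese", "tabby", "persian", "sphynx"],
}

# Train on half of the subpopulations of each superclass...
train_subpops = {sup: subs[: len(subs) // 2] for sup, subs in superclasses.items()}
# ...and test on the held-out half: test inputs come from subpopulations the
# model never saw, while the superclass labels themselves stay fixed.
test_subpops = {sup: subs[len(subs) // 2:] for sup, subs in superclasses.items()}

The gap between accuracy on the two splits then gauges how well the model's notion of each superclass transfers to unseen subpopulations.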

Bibliographic Details
Main Author: Tsipras, Dimitris
Other Authors: Madry, Aleksander
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: Ph.D., September 2021
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Rights: In Copyright - Educational Use Permitted (Copyright MIT); http://rightsstatements.org/page/InC-EDU/1.0/
Online Access: https://hdl.handle.net/1721.1/140148