Towards ML Models That We Can Deploy Confidently

As machine learning (ML) systems are deployed in the real world, the reliability and trustworthiness of these systems become an ever more salient challenge. This thesis aims to address this challenge through two key thrusts: (1) making ML models more trustworthy by leveraging what has been perceived solely as a weakness of ML models, namely adversarial perturbations, and (2) exploring the underpinnings of reliable ML deployment. Specifically, in the first thrust, we focus on adversarial perturbations, which constitute a well-known threat to the integrity of ML models, and show how to build ML models that are robust to so-called adversarial patches. We then show that adversarial perturbations can be repurposed: rather than merely being a weakness of ML models, they can bolster these models’ resilience and reliability. To this end, we leverage these perturbations to, first, develop a way to create objects that are easier for ML models to recognize; then to devise a way to safeguard images against unwanted AI-powered alterations; and finally to improve transfer learning performance. The second thrust of this thesis revolves around ML model interpretability and debugging, so as to ensure the safety, equitability, and unbiased decision-making of ML systems. In particular, we investigate methods for building ML models that are more debuggable, and we provide tools for diagnosing their failure modes. We then study how data affects model behavior and identify unexpected ways in which data can introduce biases into ML models, particularly in the context of transfer learning. Finally, we put forth a data-based framework for studying transfer learning, which can help us discover problematic biases inherited from pretraining data.


Bibliographic Details
Main Author: Salman, Hadi
Other Authors: Madry, Aleksander
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: Ph.D.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Rights: In Copyright - Educational Use Permitted; copyright retained by author(s) (https://rightsstatements.org/page/InC-EDU/1.0/)
Online Access: https://hdl.handle.net/1721.1/152859
https://orcid.org/0009-0008-9611-4702