Towards ML Models That We Can Deploy Confidently

As machine learning (ML) systems are deployed in the real world, the reliability and trustworthiness of these systems become an ever more salient challenge. This thesis aims to address this challenge through two key thrusts: (1) making ML models more trustworthy by leveraging what has been perceived solely as a weakness of ML models, namely adversarial perturbations, and (2) exploring the underpinnings of reliable ML deployment. Specifically, in the first thrust, we focus on adversarial perturbations, which constitute a well-known threat to the integrity of ML models, and show how to build ML models that are robust to so-called adversarial patches. We then show that adversarial perturbations can be repurposed: rather than merely being a weakness of ML models, they can bolster these models’ resilience and reliability. To this end, we leverage these perturbations to, first, develop a way to create objects that are easier for ML models to recognize; then to devise a way to safeguard images against unwanted AI-powered alterations; and finally to improve transfer learning performance. The second thrust of this thesis revolves around ML model interpretability and debugging, so as to ensure the safety, equitability, and unbiased decision-making of ML systems. In particular, we investigate methods for building ML models that are more debuggable, and we provide tools for diagnosing their failure modes. We then study how data affects model behavior and identify unexpected ways in which data can introduce biases into ML models, particularly in the context of transfer learning. Finally, we put forth a data-based framework for studying transfer learning, which can help us discover problematic biases inherited from pretraining data.


Bibliographic Details
Main Author: Salman, Hadi
Other Authors: Madry, Aleksander
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: Ph.D.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Rights: In Copyright - Educational Use Permitted; copyright retained by author(s) (https://rightsstatements.org/page/InC-EDU/1.0/)
Online Access: https://hdl.handle.net/1721.1/152859
https://orcid.org/0009-0008-9611-4702