Data Attribution: From Classifiers to Generative Models

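The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of deep neural networks), while methods that are effective in such regimes require training thousands of models, which makes them impractical for large models or datasets. Moreover, existing methods are often tailored to the supervised learning setting, and are not well-defined for generative models. In this thesis, we introduce TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models. In particular, by leveraging only a handful of trained models, TRAK can match the performance of attribution methods that require training thousands of models. We first demonstrate the utility of TRAK across various modalities and scales in the supervised setting: image classifiers trained on ImageNet, vision-language models (CLIP), and language models (BERT and mT5). Then, we extend TRAK to the generative setting, and show that it can be used to attribute different classes of diffusion models (DDPMs and LDMs).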

Bibliographic Details
Main Author:	Georgiev, Kristian
Other Authors:	Mądry, Aleksander
Format:	Thesis
Published:	Massachusetts Institute of Technology 2023
Online Access:	https://hdl.handle.net/1721.1/152676

_version_	1826188994287763456
author	Georgiev, Kristian
author2	Mądry, Aleksander
author_facet	Mądry, Aleksander Georgiev, Kristian
author_sort	Georgiev, Kristian
collection	MIT
description	The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of deep neural networks), while methods that are effective in such regimes require training thousands of models, which makes them impractical for large models or datasets. Moreover, existing methods are often tailored to the supervised learning setting, and are not well-defined for generative models. In this thesis, we introduce TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models. In particular, by leveraging only a handful of trained models, TRAK can match the performance of attribution methods that require training thousands of models. We first demonstrate the utility of TRAK across various modalities and scales in the supervised setting: image classifiers trained on ImageNet, vision-language models (CLIP), and language models (BERT and mT5). Then, we extend TRAK to the generative setting, and show that it can be used to attribute different classes of diffusion models (DDPMs and LDMs).
first_indexed	2024-09-23T08:08:11Z
format	Thesis
id	mit-1721.1/152676
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T08:08:11Z
publishDate	2023
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/152676 2023-11-03T04:08:26Z Data Attribution: From Classifiers to Generative Models Georgiev, Kristian Mądry, Aleksander Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of deep neural networks), while methods that are effective in such regimes require training thousands of models, which makes them impractical for large models or datasets. Moreover, existing methods are often tailored to the supervised learning setting, and are not well-defined for generative models. In this thesis, we introduce TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models. In particular, by leveraging only a handful of trained models, TRAK can match the performance of attribution methods that require training thousands of models. We first demonstrate the utility of TRAK across various modalities and scales in the supervised setting: image classifiers trained on ImageNet, vision-language models (CLIP), and language models (BERT and mT5). Then, we extend TRAK to the generative setting, and show that it can be used to attribute different classes of diffusion models (DDPMs and LDMs). S.M. 2023-11-02T20:07:43Z 2023-11-02T20:07:43Z 2023-09 2023-09-21T14:25:57.309Z Thesis https://hdl.handle.net/1721.1/152676 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle	Georgiev, Kristian Data Attribution: From Classifiers to Generative Models
title	Data Attribution: From Classifiers to Generative Models
title_full	Data Attribution: From Classifiers to Generative Models
title_fullStr	Data Attribution: From Classifiers to Generative Models
title_full_unstemmed	Data Attribution: From Classifiers to Generative Models
title_short	Data Attribution: From Classifiers to Generative Models
title_sort	data attribution from classifiers to generative models
url	https://hdl.handle.net/1721.1/152676
work_keys_str_mv	AT georgievkristian dataattributionfromclassifierstogenerativemodels
