On pretraining data diversity for self-supervised learning

We explore the impact of training with more diverse datasets, characterized by the number of unique samples, on the performance of self-supervised learning (SSL) under a fixed computational budget. Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performa...

Bibliographic Details
Main Authors: Hammoud, HAAK; Das, T; Pizzati, F; Torr, P; Bibi, A; Ghanem, B
Format: Conference item
Language: English
Published: Springer 2024