Labelling unlabelled videos from scratch with multi-modal self-supervision

A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: of labeled data. Yet, labelling a dataset with human annotation continues to carry high costs, especially for videos. While in the image domain, recent methods have allowed to generate meaningfu...

ver descrição completa

Detalhes bibliográficos
Main Authors: Asano, YM, Patrick, M, Rupprecht, C, Vedaldi, A
Formato: Conference item
Idioma:English
Publicado em: NeurIPS 2020