Labelling unlabelled videos from scratch with multi-modal self-supervision
A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: of labeled data. Yet, labelling a dataset with human annotation continues to carry high costs, especially for videos. While in the image domain, recent methods have allowed to generate meaningfu...
Những tác giả chính: | , , , |
---|---|
Định dạng: | Conference item |
Ngôn ngữ: | English |
Được phát hành: |
NeurIPS
2020
|