End-to-end learning of visual representations from uncurated instructional videos
Annotating videos is cumbersome, expensive and not scalable. Yet, many strong video models still rely on manually annotated data. With the recent introduction of the HowTo100M dataset, narrated videos now offer the possibility of learning video representations without manual supervision. In this wor...
Main creators: | Miech, A, Alayrac, J-B, Smaira, L, Laptev, I, Sivic, J, Zisserman, A |
---|---|
Format: | Conference item |
Language: | English |
Published / Created: | IEEE, 2020 |
Similar items
- The visual centrifuge: Model-free layered video representations
  by: Alayrac, J-B, et al.
  Published / Created: (2020)
- Visual grounding in video for unsupervised word translation
  by: Sigurdsson, GA, et al.
  Published / Created: (2020)
- Semi-supervised learning of facial attributes in video
  by: Cherniavsky, N, et al.
  Published / Created: (2013)
- Video Google: efficient visual search of videos
  by: Sivic, J, et al.
  Published / Created: (2007)
- Deep learning for automated visual inspection of uncured rubber
  by: Smith, James Thomas Howard
  Published / Created: (2018)