Use what you have: Video retrieval using representations from collaborative experts
The rapid growth of video on the internet has made searching for video content using natural language queries a significant challenge. Human-generated queries for video datasets ‘in the wild’ vary widely in their degree of specificity, with some queries describing ‘specific details’ such as the na...
Main Authors: | Liu, Y, Albanie, S, Nagrani, A, Zisserman, A |
---|---|
Format: | Conference item |
Language: | English |
Published: | British Machine Vision Association, 2020 |
Similar Items
- Emotion Recognition in Speech using Cross-Modal Transfer in the Wild
  by: Albanie, S, et al.
  Published: (2018)
- Emotion recognition in speech using cross-modal transfer in the wild
  by: Albanie, S, et al.
  Published: (2018)
- Disentangled Speech Embeddings Using Cross-Modal Self-Supervision
  by: Nagrani, A, et al.
  Published: (2020)
- Learnable PINs: Cross-modal embeddings for person identity
  by: Nagrani, A, et al.
  Published: (2018)
- Seeing voices and hearing faces: Cross-modal biometric matching
  by: Nagrani, A, et al.
  Published: (2018)