Frozen in time: A joint video and image encoder for end-to-end retrieval
Our objective in this work is video-text retrieval – in particular a joint embedding that enables efficient text-to-video retrieval. The challenges in this area include the design of the visual architecture and the nature of the training data, in that the available large scale video-text training da...
Main authors: | , , , |
---|---|
Format: | Conference item |
Language: | English |
Published / Created: | IEEE, 2022 |