Level-wise aligned dual networks for text–video retrieval
Abstract The vast number of videos on the Internet makes efficient and accurate text–video retrieval increasingly important. Current methods align video and text in a high-dimensional space for this task. However, a high-dimensional space cannot fully use different levels of inf...
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2022-07-01 |
Series: | EURASIP Journal on Advances in Signal Processing |
Subjects: | |
Online access: | https://doi.org/10.1186/s13634-022-00887-y |