Multi-Level Cross-Modal Semantic Alignment Network for Video–Text Retrieval
This paper aims to improve the performance of video–text retrieval. To date, many algorithms have been proposed to refine the similarity measure for video–text retrieval, moving from a single global semantic to multi-level semantics. However, these methods may suffer from the following limitations:... (a brief illustrative sketch of multi-level similarity scoring follows the record details below).
Main Authors: Fudong Nian, Ling Ding, Yuxia Hu, Yanhong Gu
Format: Article
Language: English
Published: MDPI AG, 2022-09-01
Series: Mathematics
Subjects:
Online Access: https://www.mdpi.com/2227-7390/10/18/3346
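The abstract above contrasts a single global video–text similarity with multi-level semantic similarity. As a rough illustration only (not the paper's actual network), the following minimal Python sketch shows how similarities computed at several hypothetical semantic levels ("global", "action", "entity" are assumed names, as is the weighted-sum fusion) might be combined into one retrieval score from precomputed embeddings:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def multi_level_similarity(video_feats: dict, text_feats: dict,
                           weights=(0.5, 0.3, 0.2)) -> float:
    """Combine per-level video-text similarities into one score.

    `video_feats` / `text_feats` are hypothetical dicts with one embedding per
    semantic level; the paper's actual levels and fusion scheme may differ.
    """
    levels = ("global", "action", "entity")
    sims = [cosine(video_feats[lvl], text_feats[lvl]) for lvl in levels]
    return float(sum(w * s for w, s in zip(weights, sims)))

# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
video = {lvl: rng.normal(size=512) for lvl in ("global", "action", "entity")}
text = {lvl: rng.normal(size=512) for lvl in ("global", "action", "entity")}
print(multi_level_similarity(video, text))
```

In a retrieval setting, such a score would be computed between a query and every candidate in the other modality, and candidates would be ranked by it; the single-global-semantic baseline mentioned in the abstract corresponds to using only the "global" term.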
Similar Items
- A cross-modal conditional mechanism based on attention for text-video retrieval
  by: Wanru Du, et al.
  Published: (2023-11-01)
- Cross-Modal Learning Based on Semantic Correlation and Multi-Task Learning for Text-Video Retrieval
  by: Xiaoyu Wu, et al.
  Published: (2020-12-01)
- Cross-Modal Retrieval via Similarity-Preserving Learning and Semantic Average Embedding
  by: Tao Zhi, et al.
  Published: (2020-01-01)
- Level-wise aligned dual networks for text–video retrieval
  by: Qiubin Lin, et al.
  Published: (2022-07-01)
- Deep Multi-Modal Metric Learning with Multi-Scale Correlation for Image-Text Retrieval
  by: Yan Hua, et al.
  Published: (2020-03-01)