Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence
Visual-semantic embedding aims to learn a joint embedding space where related video and sentence instances are located close to each other. Most existing methods put instances in a single embedding space. However, they struggle to embed instances due to the difficulty of matching visual dynamics in...
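The abstract describes learning a joint embedding space in which matching videos and sentences lie close together. As a rough sketch (not the authors' implementation; function names and the margin value are illustrative), the standard formulation of such an embedding scores video-sentence pairs by cosine similarity and trains with a hinge-based triplet ranking loss:

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Project embeddings onto the unit hypersphere so that the
    # dot product equals cosine similarity.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def similarity_matrix(video_emb, sent_emb):
    # Cosine similarity between every video and every sentence embedding.
    v = l2_normalize(video_emb)
    s = l2_normalize(sent_emb)
    return v @ s.T

def triplet_ranking_loss(sim, margin=0.2):
    # Hinge-based ranking loss commonly used for joint embeddings:
    # matched pairs sit on the diagonal; off-diagonal pairs are negatives.
    n = sim.shape[0]
    pos = np.diag(sim)
    cost_s = np.maximum(0.0, margin + sim - pos[:, None])  # harder sentence negatives
    cost_v = np.maximum(0.0, margin + sim - pos[None, :])  # harder video negatives
    off_diag = 1.0 - np.eye(n)
    return float(((cost_s + cost_v) * off_diag).sum() / n)
```

When every matched pair is more similar than any mismatched pair by at least the margin, the loss drops to zero, which is exactly the geometry a retrieval system needs: the correct sentence ranks first for its video and vice versa.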
Main Authors: Huy Manh Nguyen, Tomo Miyazaki, Yoshihiro Sugaya, Shinichiro Omachi
Format: Article
Language: English
Published: MDPI AG, 2021-04-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/11/7/3214
Similar Items
- Video Caption Based Searching Using End-to-End Dense Captioning and Sentence Embeddings
  by: Akshay Aggarwal, et al.
  Published: (2020-06-01)
- A Video Representation Method Based on Multi-View Structure Preserving Embedding for Action Retrieval
  by: Ke Zhang, et al.
  Published: (2019-01-01)
- Neural sentence embedding models for semantic similarity estimation in the biomedical domain
  by: Kathrin Blagec, et al.
  Published: (2019-04-01)
- Semantic embeddings of generic objects for zero-shot learning
  by: Tristan Hascoet, et al.
  Published: (2019-01-01)
- On the Limitations of Visual-Semantic Embedding Networks for Image-to-Text Information Retrieval
  by: Yan Gong, et al.
  Published: (2021-07-01)