On the Limitations of Visual-Semantic Embedding Networks for Image-to-Text Information Retrieval
Visual-semantic embedding (VSE) networks map images and texts into a shared embedding space, producing joint representations that enable information-retrieval tasks such as image–text retrieval, image captioning, and visual question answering. The most recent state-of-the-art...
| Main Authors: | Yan Gong, Georgina Cosma, Hui Fang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2021-07-01 |
| Series: | Journal of Imaging |
| Online Access: | https://www.mdpi.com/2313-433X/7/8/125 |
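The abstract frames retrieval as a nearest-neighbour search over a shared embedding space. As a rough illustration only, not the paper's implementation, the NumPy sketch below ranks candidate captions for an image query by cosine similarity; the embeddings are random placeholders standing in for the outputs of trained encoders, and the dimensionality is assumed.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each embedding to unit length so a dot product
    equals cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Placeholder vectors standing in for the outputs of trained image and
# text encoders; a real VSE network would learn these jointly.
rng = np.random.default_rng(0)
dim = 512                                                        # assumed shared dimensionality
caption_embeddings = l2_normalize(rng.normal(size=(1000, dim)))  # candidate texts
image_query = l2_normalize(rng.normal(size=(dim,)))              # one image query

# Image-to-text retrieval then reduces to nearest-neighbour search in
# the shared space: score every caption against the query and rank.
scores = caption_embeddings @ image_query
top_k = np.argsort(-scores)[:5]  # indices of the 5 highest-scoring captions
print(top_k, scores[top_k])
```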
Similar Items
- Deep Semantic Cross Modal Hashing Based on Graph Similarity of Modal-Specific
  by: Junzheng Li
  Published: (2021-01-01)
- Learning Adequate Alignment and Interaction for Cross-Modal Retrieval
  by: MingKang Wang, et al.
  Published: (2023-12-01)
- Hierarchical Semantic Loss and Confidence Estimator for Visual-Semantic Embedding-Based Zero-Shot Learning
  by: Sanghyun Seo, et al.
  Published: (2019-08-01)
- MESH: A Flexible Manifold-Embedded Semantic Hashing for Cross-Modal Retrieval
  by: Fangming Zhong, et al.
  Published: (2020-01-01)
- Image–Text Cross-Modal Retrieval with Instance Contrastive Embedding
  by: Ruigeng Zeng, et al.
  Published: (2024-01-01)