Learning Adequate Alignment and Interaction for Cross-Modal Retrieval

Cross-modal retrieval has attracted widespread attention in many cross-media similarity search applications, especially image-text retrieval in the fields of computer vision and natural language processing. Recently, visual and semantic embedding (VSE) learning has shown promising improvements on im...

Bibliographic Details
Main Authors:	MingKang Wang, Min Meng, Jigang Liu, Jigang Wu
Format:	Article
Language:	English
Published:	KeAi Communications Co., Ltd. 2023-12-01
Series:	Virtual Reality & Intelligent Hardware
Subjects:	Cross-modal Retrieval; Visual Semantic Embedding; Feature Aggregation; Transformer
Online Access:	http://www.sciencedirect.com/science/article/pii/S209657962300027X

Similar Items

Text-Image Cross-modal Retrieval Based on Transformer
by: YANG Xiaoyu, LI Chao, CHEN Shunyao, LI Haoliang, YIN Guangqiang
Published: (2023-04-01)

Cross modal recipe retrieval with fine grained modal interaction
by: Fan Zhao, et al.
Published: (2025-02-01)

Deep Semantic Cross Modal Hashing Based on Graph Similarity of Modal-Specific
by: Junzheng Li
Published: (2021-01-01)

Cross-modal retrieval based on multi-dimensional feature fusion hashing
by: Dongxiao Ren, et al.
Published: (2024-06-01)

On the Limitations of Visual-Semantic Embedding Networks for Image-to-Text Information Retrieval
by: Yan Gong, et al.
Published: (2021-07-01)

Deep Hashing Similarity Learning for Cross-Modal Retrieval
by: Ying Ma, et al.
Published: (2024-01-01)

Disambiguity and Alignment: An Effective Multi-Modal Alignment Method for Cross-Modal Recipe Retrieval
by: Zhuoyang Zou, et al.
Published: (2024-05-01)

MESH: A Flexible Manifold-Embedded Semantic Hashing for Cross-Modal Retrieval
by: Fangming Zhong, et al.
Published: (2020-01-01)

Deep Label Feature Fusion Hashing for Cross-Modal Retrieval
by: Dongxiao Ren, et al.
Published: (2022-01-01)

An Enhanced Feature Extraction Framework for Cross-Modal Image–Text Retrieval
by: Jinzhi Zhang, et al.
Published: (2024-06-01)

Learning Cross-Modal Aligned Representation With Graph Embedding
by: Youcai Zhang, et al.
Published: (2018-01-01)

Deep Self-Supervised Hashing With Fine-Grained Similarity Mining for Cross-Modal Retrieval
by: Lijun Han, et al.
Published: (2024-01-01)

Survey of Research Progress on Cross-modal Retrieval
by: FENG Xia, HU Zhi-yi, LIU Cai-hua
Published: (2021-08-01)

Alleviating the inconsistency of multimodal data in cross-modal retrieval
by: Tieying Li, et al.
Published: (2024)

Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence
by: Huy Manh Nguyen, et al.
Published: (2021-04-01)

A Fine-Grained Semantic Alignment Method Specific to Aggregate Multi-Scale Information for Cross-Modal Remote Sensing Image Retrieval
by: Fuzhong Zheng, et al.
Published: (2023-10-01)

Cross-Modal Image Retrieval Considering Semantic Relationships With Many-to-Many Correspondence Loss
by: Huaying Zhang, et al.
Published: (2023-01-01)

Object Feature Based Deep Hashing for Cross-Modal Retrieval
by: ZHU Jie, BAI Hongyu, ZHANG Zhongyu, XIE Bojun, ZHANG Junsan
Published: (2021-05-01)

Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
by: Shuli Cheng, et al.
Published: (2020-11-01)

ClusterE-ZSL: A Novel Cluster-Based Embedding for Enhanced Zero-Shot Learning in Contrastive Pre-Training Cross-Modal Retrieval
by: Umair Tariq, et al.
Published: (2024-01-01)

A Cross-Modal Semantic Alignment and Feature Fusion Method for Bionic Drone and Bird Recognition
by: Hehao Liu, et al.
Published: (2024-08-01)

Supervised Contrastive Learning for 3D Cross-Modal Retrieval
by: Yeon-Seung Choo, et al.
Published: (2024-11-01)

Deep Feature-Based Neighbor Similarity Hashing With Adversarial Learning for Cross-Modal Retrieval
by: Kun Li, et al.
Published: (2024-01-01)

Cross-Modal Retrieval: A Review of Methodologies, Datasets, and Future Perspectives
by: Zhichao Han, et al.
Published: (2024-01-01)

Transformer-Based Discriminative and Strong Representation Deep Hashing for Cross-Modal Retrieval
by: Suqing Zhou, et al.
Published: (2023-01-01)

Contrasting Dual Transformer Architectures for Multi-Modal Remote Sensing Image Retrieval
by: Mohamad M. Al Rahhal, et al.
Published: (2022-12-01)

Text-Image Matching for Cross-Modal Remote Sensing Image Retrieval via Graph Neural Network
by: Hongfeng Yu, et al.
Published: (2023-01-01)

Bridging the gap: multi-granularity representation learning for text-based vehicle retrieval
by: Xue Bo, et al.
Published: (2024-11-01)

Multi-Level Cross-Modal Semantic Alignment Network for Video–Text Retrieval
by: Fudong Nian, et al.
Published: (2022-09-01)

The State of the Art for Cross-Modal Retrieval: A Survey
by: Kun Zhou, et al.
Published: (2023-01-01)

Semantic-Aligned Cross-Modal Visual Grounding Network with Transformers
by: Qianjun Zhang, et al.
Published: (2023-05-01)

A Framework for Enabling Unpaired Multi-Modal Learning for Deep Cross-Modal Hashing Retrieval
by: Mikel Williams-Lekuona, et al.
Published: (2022-12-01)

A Cross-Modal Hash Retrieval Method with Fused Triples
by: Wenxiao Li, et al.
Published: (2023-09-01)

Hierarchical Semantic Loss and Confidence Estimator for Visual-Semantic Embedding-Based Zero-Shot Learning
by: Sanghyun Seo, et al.
Published: (2019-08-01)

Fusion of Textural and Visual Information for Medical Image Modality Retrieval Using Deep Learning-Based Feature Engineering
by: Saeed Iqbal, et al.
Published: (2023-01-01)

Online Adaptive Supervised Hashing for Large-Scale Cross-Modal Retrieval
by: Ruoqi Su, et al.
Published: (2020-01-01)

Cross Task Modality Alignment Network for Sketch Face Recognition
by: Yanan Guo, et al.
Published: (2022-06-01)

A Cross-Attention Mechanism Based on Regional-Level Semantic Features of Images for Cross-Modal Text-Image Retrieval in Remote Sensing
by: Fuzhong Zheng, et al.
Published: (2022-11-01)

Supervised Intra- and Inter-Modality Similarity Preserving Hashing for Cross-Modal Retrieval
by: Zhikui Chen, et al.
Published: (2018-01-01)

Remote Sensing Cross-Modal Text-Image Retrieval Based on Attention Correction and Filtering
by: Xiaoyu Yang, et al.
Published: (2025-01-01)