Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval

The core of cross-modal hashing methods is to map high-dimensional features into binary hash codes, which can then be compared with the Hamming distance metric for efficient retrieval. Recent work emphasizes the advantages of the unsupervised cross-modal hashing technique, since it...
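As a rough illustration of the idea summarized above, the following minimal sketch binarizes real-valued feature vectors and ranks database items by Hamming distance to a query. It is not the OVCGH model from the article: the sign-thresholding rule, the function names (`binarize`, `hamming`, `rank_by_hamming`), and the toy feature values are all assumptions made for this example.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the sign bits of a feature vector into an integer hash code.

    Assumption for illustration: bit i is 1 when features[i] > 0.
    A learned hashing model would produce the features being thresholded.
    """
    code = 0
    for i, value in enumerate(features):
        if value > 0:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Toy data (hypothetical): three "image" feature vectors and one "text" query.
image_codes = [
    binarize([0.9, -0.2, 0.4, -0.7]),
    binarize([-0.5, 0.3, -0.1, 0.8]),
    binarize([0.6, -0.4, -0.3, -0.9]),
]
text_query = binarize([0.7, -0.1, 0.2, -0.3])
ranking = rank_by_hamming(text_query, image_codes)
print(ranking)  # [0, 2, 1]
```

Because each comparison reduces to an XOR and a bit count, ranking cost depends on the code length rather than the original feature dimensionality, which is the efficiency argument the summary makes.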

Bibliographic Details
Main Authors:	Ge Shi, Feng Li, Lifang Wu, Yukun Chen
Format:	Article
Language:	English
Published:	MDPI AG 2022-04-01
Series:	Sensors
Subjects:	cross-modal hash learning; deep model; hashing retrieval
Online Access:	https://www.mdpi.com/1424-8220/22/8/2921

_version_	1827599420451979264
author	Ge Shi; Feng Li; Lifang Wu; Yukun Chen
author_facet	Ge Shi; Feng Li; Lifang Wu; Yukun Chen
author_sort	Ge Shi
collection	DOAJ
description	The core of cross-modal hashing methods is to map high-dimensional features into binary hash codes, which can then be compared with the Hamming distance metric for efficient retrieval. Recent work emphasizes the advantages of the unsupervised cross-modal hashing technique, since it relies only on the relevance information of paired data, making it more applicable to real-world applications. However, two problems, namely intra-modality correlation and inter-modality correlation, have still not been fully considered. Intra-modality correlation describes the complex overall concept of a single modality and provides semantic relevance for retrieval tasks, while inter-modality correlation refers to the relationship between different modalities. From our observation and hypothesis, the dependency relationships within a modality and between different modalities can be constructed at the object level, which can further improve cross-modal hashing retrieval accuracy. To this end, we propose an Object-level Visual-text Correlation Graph Hashing (OVCGH) approach to mine the fine-grained object-level similarity in cross-modal data while suppressing noise interference. Specifically, a novel intra-modality correlation graph is designed to learn graph-level representations of different modalities, capturing the region-to-region dependencies among image regions and the tag-to-tag dependencies among text tags in an unsupervised manner. Then, we design a visual-text dependency building module that captures semantic correlation between different modalities by modeling the dependency relationship between image object regions and text tags. Extensive experiments on two widely used datasets verify the effectiveness of our proposed approach.
first_indexed	2024-03-09T04:13:55Z
format	Article
id	doaj.art-45e3a86afe894d62a76e514a1852877b
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-09T04:13:55Z
publishDate	2022-04-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-45e3a86afe894d62a76e514a1852877b | 2023-12-03T13:56:49Z | eng | MDPI AG | Sensors | 1424-8220 | 2022-04-01 | 22 | 8 | 2921 | 10.3390/s22082921 | Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval | Ge Shi (0); Feng Li (1); Lifang Wu (2); Yukun Chen (3) | (0-3) Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China | https://www.mdpi.com/1424-8220/22/8/2921 | cross-modal hash learning; deep model; hashing retrieval
title	Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval
topic	cross-modal hash learning; deep model; hashing retrieval
url	https://www.mdpi.com/1424-8220/22/8/2921

Similar Items