PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification

Given a textual query, text-based person re-identification aims to retrieve the target pedestrian images from a large-scale visual database. Due to the inherent heterogeneity between modalities, it is challenging to measure the cross-modal affinity between visual and textual dat...

Bibliographic Details
Main Authors:	Chao Liu, Jingyi Xue, Zijie Wang, Aichun Zhu
Format:	Article
Language:	English
Published:	MDPI AG 2023-10-01
Series:	Applied Sciences
Subjects:	text-based person retrieval; person re-identification; multi-granular matching
Online Access:	https://www.mdpi.com/2076-3417/13/21/11876

_version_	1827765885715087360
author	Chao Liu, Jingyi Xue, Zijie Wang, Aichun Zhu
author_facet	Chao Liu, Jingyi Xue, Zijie Wang, Aichun Zhu
author_sort	Chao Liu
collection	DOAJ
description	Given a textual query, text-based person re-identification aims to retrieve the target pedestrian images from a large-scale visual database. Due to the inherent heterogeneity between modalities, it is challenging to measure the cross-modal affinity between visual and textual data. Existing works typically employ single-granular methods to extract local features and align image regions with relevant words or phrases. However, single-granular methods have limited robustness and cannot adapt to the imprecision and variance of visual and textual features, which are usually affected by background clutter, position transformation, posture diversity, and occlusion in surveillance videos; this degrades cross-modal matching accuracy. In this paper, we propose a Pyramidal Multi-Granular matching network (PMG) that uses a coarse-to-fine pyramidal method to transition gradually from the coarsest global information to the finest local information for multi-granular cross-modal feature extraction and affinity learning. For each body part of a pedestrian, PMG can preserve the integrity of local information at a certain scale while minimizing surrounding interference, adaptively capture the discriminative signals of different body parts, and achieve semantic alignment between image strips and the relevant textual descriptions, thus suppressing the variance of feature extraction and improving the robustness of feature matching. Comprehensive experiments on the CUHK-PEDES and RSTPReid datasets validate the effectiveness of the proposed method; the results show that PMG significantly outperforms state-of-the-art (SOTA) methods and yields competitive cross-modal retrieval accuracy.
first_indexed	2024-03-11T11:34:10Z
format	Article
id	doaj.art-f8a18bcc38474fc2a651d44da31ac23b
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-11T11:34:10Z
publishDate	2023-10-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-f8a18bcc38474fc2a651d44da31ac23b | 2023-11-10T14:59:01Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2023-10-01 | 13 | 21 | 11876 | 10.3390/app132111876 | PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification | Chao Liu (0), Jingyi Xue (1), Zijie Wang (2), Aichun Zhu (3) | (0) School of Intelligent Science and Control Engineering, Jinling Institute of Technology, Nanjing 211199, China | (1) School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China | (2) School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China | (3) School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China | https://www.mdpi.com/2076-3417/13/21/11876 | text-based person retrieval | person re-identification | multi-granular matching
spellingShingle	Chao Liu, Jingyi Xue, Zijie Wang, Aichun Zhu | PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification | Applied Sciences | text-based person retrieval | person re-identification | multi-granular matching
title	PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification
title_full	PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification
title_fullStr	PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification
title_full_unstemmed	PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification
title_short	PMG—Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification
title_sort	pmg pyramidal multi granular matching for text based person re identification
topic	text-based person retrieval; person re-identification; multi-granular matching
url	https://www.mdpi.com/2076-3417/13/21/11876
work_keys_str_mv	AT chaoliu pmgpyramidalmultigranularmatchingfortextbasedpersonreidentification AT jingyixue pmgpyramidalmultigranularmatchingfortextbasedpersonreidentification AT zijiewang pmgpyramidalmultigranularmatchingfortextbasedpersonreidentification AT aichunzhu pmgpyramidalmultigranularmatchingfortextbasedpersonreidentification
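
The coarse-to-fine matching summarized in the description field can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no architectural details, so the strip counts (1, 2, 4), mean pooling, cosine similarity, best-phrase matching, and every function name below are assumptions chosen only for illustration.

```python
import math


def mean_pool(rows):
    """Average a list of equal-length vectors element-wise."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def pyramid_strips(feature_rows, levels=(1, 2, 4)):
    """Split the row axis into k horizontal strips for each k in `levels`
    and mean-pool every strip. k = 1 is the coarsest (global) level.
    Returns {k: [strip_vector, ...]}. (Hypothetical helper.)"""
    height = len(feature_rows)
    pyramid = {}
    for k in levels:
        if k > height:
            raise ValueError("more strips than feature rows")
        # Integer boundaries guarantee every strip has at least one row.
        bounds = [(i * height) // k for i in range(k + 1)]
        pyramid[k] = [
            mean_pool(feature_rows[bounds[i]:bounds[i + 1]]) for i in range(k)
        ]
    return pyramid


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def multi_granular_affinity(feature_rows, phrase_vectors, levels=(1, 2, 4)):
    """Assumed scoring rule: at each level, match every image strip to its
    most similar phrase vector, average over strips, then average over levels."""
    pyramid = pyramid_strips(feature_rows, levels)
    per_level = []
    for k in levels:
        best = [max(cosine(s, p) for p in phrase_vectors) for s in pyramid[k]]
        per_level.append(sum(best) / len(best))
    return sum(per_level) / len(per_level)


# Toy input: 4 feature rows (top to bottom) with 2-dimensional features,
# and 2 phrase vectors standing in for textual descriptions.
image_rows = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
phrases = [[1.0, 0.0], [0.0, 1.0]]
score = multi_granular_affinity(image_rows, phrases)
```

In this toy input the global level blurs the two halves of the image into a single vector, while the finer levels match each strip to its phrase exactly, which is the kind of complementary information a multi-granular scheme is meant to combine.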

Similar Items