Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification

Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local informat...

Bibliographic Details
Main Authors:	Jiali Xi, Qin Zhou, Yiru Zhao, Shibao Zheng
Format:	Article
Language:	English
Published:	IEEE 2019-01-01
Series:	IEEE Access
Subjects:	Person re-identification; image retrieval; deep learning; attention module
Online Access:	https://ieeexplore.ieee.org/document/8782110/

_version_	1818609185504362496
author	Jiali Xi, Qin Zhou, Yiru Zhao, Shibao Zheng
author_facet	Jiali Xi, Qin Zhou, Yiru Zhao, Shibao Zheng
author_sort	Jiali Xi
collection	DOAJ
description	Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, leaving fine-grained local information inaccessible. To address this issue, we propose a novel module called fine-grained fusion with distractor suppression (FFDS) that fully exploits local features to better represent a specific video clip. Concretely, in the proposed FFDS module, the importance of each local feature of an anchor image is calculated by pixel-wise correlation mining with the other intra-sequence frames. In this way, 'good' local features that co-exist across the video frames are enhanced in the attention map, while sparse 'distractors' are suppressed. Moreover, to maintain the high-level semantic information of deep CNN features while still benefiting from the fine-grained local information, we adopt a feature mimicking scheme during training. Extensive experiments on two challenging large-scale datasets demonstrate the effectiveness of the proposed method.
first_indexed	2024-12-16T14:54:31Z
format	Article
id	doaj.art-13033e05ca74495fbda383bbb64f3b47
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-12-16T14:54:31Z
publishDate	2019-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-13033e05ca74495fbda383bbb64f3b47; 2022-12-21T22:27:29Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 114310; 114319; 10.1109/ACCESS.2019.2932102; 8782110; Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification; Jiali Xi (https://orcid.org/0000-0002-0344-4374), Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Qin Zhou, Alibaba Group, Hangzhou, China; Yiru Zhao, Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Shibao Zheng, Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; https://ieeexplore.ieee.org/document/8782110/; Person re-identification; image retrieval; deep learning; attention module
title	Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification
title_sort	fine grained fusion with distractor suppression for video based person re identification
topic	Person re-identification; image retrieval; deep learning; attention module
url	https://ieeexplore.ieee.org/document/8782110/