Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification

Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local informat...

Bibliographic Details
Main Authors: Jiali Xi, Qin Zhou, Yiru Zhao, Shibao Zheng
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Person re-identification; image retrieval; deep learning; attention module
Online Access: https://ieeexplore.ieee.org/document/8782110/
collection DOAJ
description Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local information is inaccessible. To address this issue, we propose a novel module called fine-grained fusion with distractor suppression (FFDS for short) to fully exploit local features for a better representation of a specific video clip. Concretely, in the proposed FFDS module, the importance of each local feature of an anchor image is calculated by pixel-wise correlation mining with other intra-sequence frames. In this way, 'good' local features that co-exist across the video frames are enhanced in the attention map, while sparse 'distractors' can be suppressed. Moreover, to maintain the high-level semantic information of deep CNN features while also benefiting from the fine-grained local information, we adopt a feature mimicking scheme during training. Extensive experiments on two challenging large-scale datasets demonstrate the effectiveness of the proposed method.
first_indexed 2024-12-16T14:54:31Z
format Article
id doaj.art-13033e05ca74495fbda383bbb64f3b47
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-16T14:54:31Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-13033e05ca74495fbda383bbb64f3b47; 2022-12-21T22:27:29Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 114310; 114319; 10.1109/ACCESS.2019.2932102; 8782110; Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification; Jiali Xi 0 https://orcid.org/0000-0002-0344-4374; Qin Zhou 1; Yiru Zhao 2; Shibao Zheng 3; Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Alibaba Group, Hangzhou, China; Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local information is inaccessible. To address this issue, we propose a novel module called fine-grained fusion with distractor suppression (FFDS for short) to fully exploit local features for a better representation of a specific video clip. Concretely, in the proposed FFDS module, the importance of each local feature of an anchor image is calculated by pixel-wise correlation mining with other intra-sequence frames. In this way, 'good' local features that co-exist across the video frames are enhanced in the attention map, while sparse 'distractors' can be suppressed. Moreover, to maintain the high-level semantic information of deep CNN features while also benefiting from the fine-grained local information, we adopt a feature mimicking scheme during training. Extensive experiments on two challenging large-scale datasets demonstrate the effectiveness of the proposed method.; https://ieeexplore.ieee.org/document/8782110/; Person re-identification; image retrieval; deep learning; attention module
title Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification
topic Person re-identification
image retrieval
deep learning
attention module
url https://ieeexplore.ieee.org/document/8782110/
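
The description field in this record says that the importance of each local feature of an anchor image is obtained by correlating it with the local features of other frames in the same sequence, so that recurring features are enhanced and sparse distractors are suppressed. The sketch below illustrates that idea only in spirit. The record does not give the actual formulation, so the cosine similarity, the max-then-mean pooling, the softmax normalization, and the names `cosine`, `attention_map`, and `fuse` are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def attention_map(anchor, others):
    """Weight each anchor location by how well it recurs in other frames.

    anchor: list of local feature vectors, one per spatial location.
    others: list of frames, each a list of local feature vectors.
    Returns one weight per anchor location; the weights sum to 1.
    (Hypothetical scoring rule: best match per frame, averaged over frames.)
    """
    scores = []
    for feat in anchor:
        best = [max(cosine(feat, g) for g in frame) for frame in others]
        scores.append(sum(best) / len(best))
    # Softmax (an assumption) turns raw scores into a normalized map.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(anchor, weights):
    """Attention-weighted sum of the anchor's local features."""
    dim = len(anchor[0])
    return [sum(w * f[d] for w, f in zip(weights, anchor)) for d in range(dim)]


# Toy data: the third anchor location is a "distractor" that has no
# close counterpart in the other two frames.
anchor = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
others = [
    [[1.0, 0.0], [0.8, 0.2]],
    [[1.0, 0.0], [0.8, 0.2]],
]
weights = attention_map(anchor, others)
fused = fuse(anchor, weights)
print(weights)
print(fused)
```

On this toy input the distractor location receives the smallest weight, so it contributes least to the fused vector. A real system would operate on CNN feature maps rather than hand-written two-dimensional vectors.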