Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification
Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local informat...
Main Authors: | Jiali Xi, Qin Zhou, Yiru Zhao, Shibao Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Person re-identification; image retrieval; deep learning; attention module |
Online Access: | https://ieeexplore.ieee.org/document/8782110/ |
_version_ | 1818609185504362496 |
---|---|
author | Jiali Xi; Qin Zhou; Yiru Zhao; Shibao Zheng |
author_facet | Jiali Xi; Qin Zhou; Yiru Zhao; Shibao Zheng |
author_sort | Jiali Xi |
collection | DOAJ |
description | Video-based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local information is inaccessible. To address this issue, we propose a novel module called fine-grained fusion with distractor suppression (FFDS for short) to fully exploit local features toward a better representation of a specific video clip. Concretely, in the proposed FFDS module, the importance of each local feature of an anchor image is calculated by pixel-wise correlation mining with the other intra-sequence frames. In this way, 'good' local features that co-exist across the video frames are enhanced in the attention map, while sparse 'distractors' are suppressed. Moreover, to maintain the high-level semantic information of deep CNN features while also enjoying the fine-grained local information, we adopt a feature mimicking scheme during the training process. Extensive experiments on two challenging large-scale datasets demonstrate the effectiveness of the proposed method. |
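The abstract's pixel-wise correlation mining can be illustrated with a minimal sketch (not the authors' released code): it assumes L2-normalized local CNN features, max-pooled cross-frame similarity as each location's importance score, and a softmax attention map over anchor locations; `ffds_attention` and all variable names are hypothetical.

```python
# Minimal sketch of cross-frame, pixel-wise correlation attention (assumed details,
# not the paper's exact formulation). Requires a clip with more than one frame.
import torch
import torch.nn.functional as F

def ffds_attention(feats):
    """feats: (T, C, H, W) CNN feature maps of one video clip, T > 1.
    Returns an attention-weighted clip descriptor of shape (C,)."""
    T, C, H, W = feats.shape
    # Flatten the spatial grid to local features: (T, H*W, C), L2-normalized.
    local = F.normalize(feats.flatten(2).transpose(1, 2), dim=-1)
    clip_descriptors = []
    for t in range(T):
        anchor = local[t]                                            # (H*W, C)
        others = torch.cat([local[s] for s in range(T) if s != t])   # ((T-1)*H*W, C)
        # Pixel-wise correlation between anchor locations and every location
        # in the other frames of the same sequence.
        corr = anchor @ others.t()                                   # (H*W, (T-1)*H*W)
        # Importance: how strongly each anchor location is supported elsewhere.
        # Recurring ('good') locations score high; sparse distractors score low.
        importance = corr.max(dim=1).values                          # (H*W,)
        attn = torch.softmax(importance, dim=0)                      # attention map
        # Attention-weighted pooling of the anchor frame's local features.
        clip_descriptors.append(attn @ feats[t].flatten(1).t())      # (C,)
    return torch.stack(clip_descriptors).mean(dim=0)

# Usage with random tensors standing in for backbone outputs:
clip_feat = ffds_attention(torch.randn(8, 256, 16, 8))
print(clip_feat.shape)  # torch.Size([256])
```

Under these assumptions, an occlusion or background patch that appears in only one frame finds no strong correspondence in the remaining frames, so its importance stays low and it contributes little to the pooled clip descriptor, which is the intuition behind "distractor suppression."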
first_indexed | 2024-12-16T14:54:31Z |
format | Article |
id | doaj.art-13033e05ca74495fbda383bbb64f3b47 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-16T14:54:31Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-13033e05ca74495fbda383bbb64f3b47; 2022-12-21T22:27:29Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; vol. 7, pp. 114310-114319; doi:10.1109/ACCESS.2019.2932102; article no. 8782110; Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification; Jiali Xi (https://orcid.org/0000-0002-0344-4374), Qin Zhou, Yiru Zhao, Shibao Zheng; Affiliations: Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Alibaba Group, Hangzhou, China; Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China; abstract as in the description field above; https://ieeexplore.ieee.org/document/8782110/; Person re-identification; image retrieval; deep learning; attention module |
spellingShingle | Jiali Xi; Qin Zhou; Yiru Zhao; Shibao Zheng; Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification; IEEE Access; Person re-identification; image retrieval; deep learning; attention module |
title | Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification |
title_full | Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification |
title_fullStr | Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification |
title_full_unstemmed | Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification |
title_short | Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification |
title_sort | fine grained fusion with distractor suppression for video based person re identification |
topic | Person re-identification; image retrieval; deep learning; attention module |
url | https://ieeexplore.ieee.org/document/8782110/ |
work_keys_str_mv | AT jialixi finegrainedfusionwithdistractorsuppressionforvideobasedpersonreidentification AT qinzhou finegrainedfusionwithdistractorsuppressionforvideobasedpersonreidentification AT yiruzhao finegrainedfusionwithdistractorsuppressionforvideobasedpersonreidentification AT shibaozheng finegrainedfusionwithdistractorsuppressionforvideobasedpersonreidentification |