Fusion in dissimilarity space for RGB-D person re-identification

Person re-identification (Re-id) is the task of recognizing people across the non-overlapping sensors of a camera network. Despite recent advances in deep learning (DL) models for multi-modal fusion, state-of-the-art Re-id approaches fail to leverage depth-guided contextual information to dynamically select the most discriminant convolutional filters for better feature embedding and inference. Low-cost modern RGB-D sensors (e.g., the Microsoft Kinect and the Intel RealSense depth camera) provide several modalities simultaneously, including illumination-invariant high-quality depth images, RGB images, and skeleton information. State-of-the-art Re-id approaches perform multi-modal fusion in feature space, where fused noisy features are likely to dominate the final recognition process. In this paper, we address this issue with an effective fusion technique in dissimilarity space. Given a query RGB-D image of an individual, two CNNs are trained separately on 3-channel RGB and 4-channel RGB-D images to produce two feature embeddings for pair-wise matching against the embeddings of reference images; the dissimilarity scores with respect to the reference images from both modalities are then fused for the final ranking. Additionally, the lack of a proper RGB-D Re-id dataset prompted us to contribute a new one, SUCVL RGBD-ID, containing RGB and depth images of 58 identities captured by three cameras: one installed under poor illumination and the other two installed in two different indoor locations with different lighting environments. Extensive experimental analysis on our dataset and two publicly available datasets shows the effectiveness of the proposed method. Moreover, the proposed method is general and can be applied to a multitude of RGB-D-based applications.

Bibliographic Details
Main Authors: Md Kamal Uddin, Antony Lam, Hisato Fukuda, Yoshinori Kobayashi, Yoshinori Kuno
Format: Article
Language: English
Published: Elsevier, 2021-12-01
Series: Array
Subjects: Re-identification; RGB-D sensors; Dissimilarity space; Triplet loss
Online Access: http://www.sciencedirect.com/science/article/pii/S2590005621000369
_version_ 1828957547352031232
author Md Kamal Uddin
Antony Lam
Hisato Fukuda
Yoshinori Kobayashi
Yoshinori Kuno
author_facet Md Kamal Uddin
Antony Lam
Hisato Fukuda
Yoshinori Kobayashi
Yoshinori Kuno
author_sort Md Kamal Uddin
collection DOAJ
description Person re-identification (Re-id) is the task of recognizing people across the non-overlapping sensors of a camera network. Despite recent advances in deep learning (DL) models for multi-modal fusion, state-of-the-art Re-id approaches fail to leverage depth-guided contextual information to dynamically select the most discriminant convolutional filters for better feature embedding and inference. Low-cost modern RGB-D sensors (e.g., the Microsoft Kinect and the Intel RealSense depth camera) provide several modalities simultaneously, including illumination-invariant high-quality depth images, RGB images, and skeleton information. State-of-the-art Re-id approaches perform multi-modal fusion in feature space, where fused noisy features are likely to dominate the final recognition process. In this paper, we address this issue with an effective fusion technique in dissimilarity space. Given a query RGB-D image of an individual, two CNNs are trained separately on 3-channel RGB and 4-channel RGB-D images to produce two feature embeddings for pair-wise matching against the embeddings of reference images; the dissimilarity scores with respect to the reference images from both modalities are then fused for the final ranking. Additionally, the lack of a proper RGB-D Re-id dataset prompted us to contribute a new one, SUCVL RGBD-ID, containing RGB and depth images of 58 identities captured by three cameras: one installed under poor illumination and the other two installed in two different indoor locations with different lighting environments. Extensive experimental analysis on our dataset and two publicly available datasets shows the effectiveness of the proposed method. Moreover, the proposed method is general and can be applied to a multitude of RGB-D-based applications.
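The fusion pipeline sketched in the description (embed the query and gallery with each CNN, score each modality in dissimilarity space, then fuse the scores before ranking) is easy to prototype. Below is a minimal PyTorch sketch, not the authors' code: the function names, the min-max score normalization, and the equal-weight averaging are illustrative assumptions, and random tensors stand in for the outputs of the two trained CNNs.

import torch
import torch.nn.functional as F

def pairwise_dissimilarity(query_emb: torch.Tensor, gallery_emb: torch.Tensor) -> torch.Tensor:
    # Euclidean distances between one L2-normalized query and N gallery embeddings.
    q = F.normalize(query_emb, dim=-1)                 # shape (d,)
    g = F.normalize(gallery_emb, dim=-1)               # shape (N, d)
    return torch.cdist(q.unsqueeze(0), g).squeeze(0)   # shape (N,)

def minmax(scores: torch.Tensor) -> torch.Tensor:
    # Rescale to [0, 1] so scores from the two modalities are comparable.
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def fused_ranking(q_rgb, g_rgb, q_rgbd, g_rgbd, alpha: float = 0.5) -> torch.Tensor:
    # Fuse per-modality dissimilarities; alpha weights the RGB stream.
    d_rgb = minmax(pairwise_dissimilarity(q_rgb, g_rgb))
    d_rgbd = minmax(pairwise_dissimilarity(q_rgbd, g_rgbd))
    fused = alpha * d_rgb + (1.0 - alpha) * d_rgbd
    return torch.argsort(fused)                        # gallery indices, best match first

# Toy usage: 128-d embeddings for one query against a 100-image gallery.
torch.manual_seed(0)
ranking = fused_ranking(torch.randn(128), torch.randn(100, 128),
                        torch.randn(128), torch.randn(100, 128))
print(ranking[:5])

Upstream of this step, the two CNNs (on 3-channel RGB and 4-channel RGB-D inputs) would be trained beforehand; given the "Triplet loss" subject heading, something like torch.nn.TripletMarginLoss would be a natural choice for that training stage.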
first_indexed 2024-12-14T08:30:39Z
format Article
id doaj.art-cd1f383f5a2d4e68a95359de140f234d
institution Directory Open Access Journal
issn 2590-0056
language English
last_indexed 2024-12-14T08:30:39Z
publishDate 2021-12-01
publisher Elsevier
record_format Article
series Array
spelling doaj.art-cd1f383f5a2d4e68a95359de140f234d (2022-12-21T23:09:31Z); eng; Elsevier; Array, ISSN 2590-0056; 2021-12-01; Vol. 12, Article 100089
Fusion in dissimilarity space for RGB-D person re-identification
Md Kamal Uddin (Graduate School of Science and Engineering, Saitama University, Saitama, Japan; Noakhali Science and Technology University, Noakhali, Bangladesh; corresponding author)
Antony Lam (Mercari, Inc., Tokyo, Japan)
Hisato Fukuda (Graduate School of Science and Engineering, Saitama University, Saitama, Japan)
Yoshinori Kobayashi (Graduate School of Science and Engineering, Saitama University, Saitama, Japan)
Yoshinori Kuno (Graduate School of Science and Engineering, Saitama University, Saitama, Japan)
[Abstract identical to the description field above.]
http://www.sciencedirect.com/science/article/pii/S2590005621000369
Re-identification; RGB-D sensors; Dissimilarity space; Triplet loss
spellingShingle Md Kamal Uddin
Antony Lam
Hisato Fukuda
Yoshinori Kobayashi
Yoshinori Kuno
Fusion in dissimilarity space for RGB-D person re-identification
Array
Re-identification
RGB-D sensors
Dissimilarity space
Triplet loss
title Fusion in dissimilarity space for RGB-D person re-identification
title_full Fusion in dissimilarity space for RGB-D person re-identification
title_fullStr Fusion in dissimilarity space for RGB-D person re-identification
title_full_unstemmed Fusion in dissimilarity space for RGB-D person re-identification
title_short Fusion in dissimilarity space for RGB-D person re-identification
title_sort fusion in dissimilarity space for rgb d person re identification
topic Re-identification
RGB-D sensors
Dissimilarity space
Triplet loss
url http://www.sciencedirect.com/science/article/pii/S2590005621000369
work_keys_str_mv AT mdkamaluddin fusionindissimilarityspaceforrgbdpersonreidentification
AT antonylam fusionindissimilarityspaceforrgbdpersonreidentification
AT hisatofukuda fusionindissimilarityspaceforrgbdpersonreidentification
AT yoshinorikobayashi fusionindissimilarityspaceforrgbdpersonreidentification
AT yoshinorikuno fusionindissimilarityspaceforrgbdpersonreidentification