Fusion in dissimilarity space for RGB-D person re-identification

Person re-identification (Re-id) is the task of recognizing people across the non-overlapping sensors of a camera network. Despite recent advances in deep learning (DL) models for multi-modal fusion, state-of-the-art Re-id approaches fail to leverage depth-guided contextual information to dynamically select the most discriminant convolutional filters for better feature embedding and inference. Low-cost modern RGB-D sensors (e.g., the Microsoft Kinect and the Intel RealSense depth camera) provide several modalities simultaneously, including illumination-invariant high-quality depth images, RGB images, and skeleton information. State-of-the-art Re-id approaches perform multi-modal fusion in feature space, where fused noisy features are likely to dominate the final recognition process. In this paper, we address this issue with an effective fusion technique in dissimilarity space. Given a query RGB-D image of an individual, two CNNs are trained separately on 3-channel RGB and 4-channel RGB-D images to produce two feature embeddings for pair-wise matching against the embeddings of reference images; the dissimilarity scores with respect to the reference images from both modalities are then fused for the final ranking. Additionally, the lack of a proper RGB-D Re-id dataset prompted us to contribute a new one, SUCVL RGBD-ID, containing RGB and depth images of 58 identities captured by three cameras: one installed under poor illumination and the other two installed in two different indoor locations with different lighting environments. Extensive experimental analysis on our dataset and two publicly available datasets shows the effectiveness of the proposed method. Moreover, the proposed method is general and can be applied to a multitude of RGB-D-based applications.

Bibliographic Details
Main Authors: Md Kamal Uddin, Antony Lam, Hisato Fukuda, Yoshinori Kobayashi, Yoshinori Kuno
Format: Article
Language: English
Published: Elsevier, 2021-12-01
Series: Array
Subjects: Re-identification; RGB-D sensors; Dissimilarity space; Triplet loss
Online Access: http://www.sciencedirect.com/science/article/pii/S2590005621000369
_version_ 1828957547352031232
author Md Kamal Uddin
Antony Lam
Hisato Fukuda
Yoshinori Kobayashi
Yoshinori Kuno
author_facet Md Kamal Uddin
Antony Lam
Hisato Fukuda
Yoshinori Kobayashi
Yoshinori Kuno
author_sort Md Kamal Uddin
collection DOAJ
description Person re-identification (Re-id) is the task of recognizing people across the non-overlapping sensors of a camera network. Despite recent advances in deep learning (DL) models for multi-modal fusion, state-of-the-art Re-id approaches fail to leverage depth-guided contextual information to dynamically select the most discriminant convolutional filters for better feature embedding and inference. Low-cost modern RGB-D sensors (e.g., the Microsoft Kinect and the Intel RealSense depth camera) provide several modalities simultaneously, including illumination-invariant high-quality depth images, RGB images, and skeleton information. State-of-the-art Re-id approaches perform multi-modal fusion in feature space, where fused noisy features are likely to dominate the final recognition process. In this paper, we address this issue with an effective fusion technique in dissimilarity space. Given a query RGB-D image of an individual, two CNNs are trained separately on 3-channel RGB and 4-channel RGB-D images to produce two feature embeddings for pair-wise matching against the embeddings of reference images; the dissimilarity scores with respect to the reference images from both modalities are then fused for the final ranking. Additionally, the lack of a proper RGB-D Re-id dataset prompted us to contribute a new one, SUCVL RGBD-ID, containing RGB and depth images of 58 identities captured by three cameras: one installed under poor illumination and the other two installed in two different indoor locations with different lighting environments. Extensive experimental analysis on our dataset and two publicly available datasets shows the effectiveness of the proposed method. Moreover, the proposed method is general and can be applied to a multitude of RGB-D-based applications.
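The fusion pipeline sketched in the description (embed the query and gallery with each CNN, score each modality in dissimilarity space, then fuse the scores before ranking) is easy to prototype. Below is a minimal PyTorch sketch, not the authors' code: the function names, the min-max score normalization, and the equal-weight averaging are illustrative assumptions, and random tensors stand in for the outputs of the two trained CNNs.

import torch
import torch.nn.functional as F

def pairwise_dissimilarity(query_emb: torch.Tensor, gallery_emb: torch.Tensor) -> torch.Tensor:
    # Euclidean distances between one L2-normalized query and N gallery embeddings.
    q = F.normalize(query_emb, dim=-1)                 # shape (d,)
    g = F.normalize(gallery_emb, dim=-1)               # shape (N, d)
    return torch.cdist(q.unsqueeze(0), g).squeeze(0)   # shape (N,)

def minmax(scores: torch.Tensor) -> torch.Tensor:
    # Rescale to [0, 1] so scores from the two modalities are comparable.
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def fused_ranking(q_rgb, g_rgb, q_rgbd, g_rgbd, alpha: float = 0.5) -> torch.Tensor:
    # Fuse per-modality dissimilarities; alpha weights the RGB stream.
    d_rgb = minmax(pairwise_dissimilarity(q_rgb, g_rgb))
    d_rgbd = minmax(pairwise_dissimilarity(q_rgbd, g_rgbd))
    fused = alpha * d_rgb + (1.0 - alpha) * d_rgbd
    return torch.argsort(fused)                        # gallery indices, best match first

# Toy usage: 128-d embeddings for one query against a 100-image gallery.
torch.manual_seed(0)
ranking = fused_ranking(torch.randn(128), torch.randn(100, 128),
                        torch.randn(128), torch.randn(100, 128))
print(ranking[:5])

Upstream of this step, the two CNNs (on 3-channel RGB and 4-channel RGB-D inputs) would be trained beforehand; given the "Triplet loss" subject heading, something like torch.nn.TripletMarginLoss would be a natural choice for that training stage.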
first_indexed 2024-12-14T08:30:39Z
format Article
id doaj.art-cd1f383f5a2d4e68a95359de140f234d
institution Directory Open Access Journal
issn 2590-0056
language English
last_indexed 2024-12-14T08:30:39Z
publishDate 2021-12-01
publisher Elsevier
record_format Article
series Array
spelling doaj.art-cd1f383f5a2d4e68a95359de140f234d (2022-12-21T23:09:31Z); eng; Elsevier; Array, ISSN 2590-0056; 2021-12-01; Vol. 12, Article 100089
Fusion in dissimilarity space for RGB-D person re-identification
Md Kamal Uddin (Graduate School of Science and Engineering, Saitama University, Saitama, Japan; Noakhali Science and Technology University, Noakhali, Bangladesh; corresponding author)
Antony Lam (Mercari, Inc., Tokyo, Japan)
Hisato Fukuda (Graduate School of Science and Engineering, Saitama University, Saitama, Japan)
Yoshinori Kobayashi (Graduate School of Science and Engineering, Saitama University, Saitama, Japan)
Yoshinori Kuno (Graduate School of Science and Engineering, Saitama University, Saitama, Japan)
[Abstract identical to the description field above.]
http://www.sciencedirect.com/science/article/pii/S2590005621000369
Re-identification; RGB-D sensors; Dissimilarity space; Triplet loss
spellingShingle Md Kamal Uddin
Antony Lam
Hisato Fukuda
Yoshinori Kobayashi
Yoshinori Kuno
Fusion in dissimilarity space for RGB-D person re-identification
Array
Re-identification
RGB-D sensors
Dissimilarity space
Triplet loss
title Fusion in dissimilarity space for RGB-D person re-identification
title_full Fusion in dissimilarity space for RGB-D person re-identification
title_fullStr Fusion in dissimilarity space for RGB-D person re-identification
title_full_unstemmed Fusion in dissimilarity space for RGB-D person re-identification
title_short Fusion in dissimilarity space for RGB-D person re-identification
title_sort fusion in dissimilarity space for rgb d person re identification
topic Re-identification
RGB-D sensors
Dissimilarity space
Triplet loss
url http://www.sciencedirect.com/science/article/pii/S2590005621000369
work_keys_str_mv AT mdkamaluddin fusionindissimilarityspaceforrgbdpersonreidentification
AT antonylam fusionindissimilarityspaceforrgbdpersonreidentification
AT hisatofukuda fusionindissimilarityspaceforrgbdpersonreidentification
AT yoshinorikobayashi fusionindissimilarityspaceforrgbdpersonreidentification
AT yoshinorikuno fusionindissimilarityspaceforrgbdpersonreidentification