Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer

Person re-identification (ReID) is the problem of cross-camera target retrieval. The extraction of robust and discriminant features is the key factor in realizing the correct correlation of targets. A model based on convolutional neural networks (CNNs) can extract more robust image features. Still,...


Bibliographic Details
Main Authors:	Xiai Yan, Shengkai Ding, Wei Zhou, Weiqi Shi, Hua Tian
Format:	Article
Language:	English
Published:	MDPI AG 2022-09-01
Series:	Electronics
Subjects:	person re-identification (ReID); unsupervised domain adaptive; transformer; vision transformer
Online Access:	https://www.mdpi.com/2079-9292/11/19/3082

_version_	1797479864767348736
author	Xiai Yan, Shengkai Ding, Wei Zhou, Weiqi Shi, Hua Tian
author_facet	Xiai Yan, Shengkai Ding, Wei Zhou, Weiqi Shi, Hua Tian
author_sort	Xiai Yan
collection	DOAJ
description	Person re-identification (ReID) is the problem of cross-camera target retrieval. Extracting robust and discriminative features is the key to correctly associating targets across cameras. A model based on convolutional neural networks (CNNs) can extract robust image features, but it moves from local to global information only by stacking successive convolution layers. Unlike a CNN, a vision transformer (ViT) captures global information from the beginning and can therefore extract more powerful features. This paper proposes an unsupervised domain adaptive person re-identification model (ViTReID) based on the vision transformer. It uses a ViT model trained on ImageNet as the pre-trained weights and a transformer encoder as the feature extraction network, which makes up for some defects of the CNN model. The network is optimized with a combined loss of the cross-entropy, triplet, and center loss functions. In addition, the person's head is evaluated and trained as a local feature and combined with the global feature of the whole body to enhance the head feature information. The experimental results show that ViTReID exceeds the baseline method (SSG) by 14% in mean average precision (mAP) on Market1501 → MSMT17. On MSMT17 → Market1501, ViTReID is 1.2% higher in rank-1 (R1) accuracy than a state-of-the-art method (SPCL); on PersonX → MSMT17, its mAP is 3.1% higher than that of the MMT-dbscan method, and on PersonX → Market1501, its mAP is 1.5% higher than that of the MMT-dbscan method.
first_indexed	2024-03-09T21:51:56Z
format	Article
id	doaj.art-7b141a3168a94cecab403968df9f46d4
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-09T21:51:56Z
publishDate	2022-09-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-7b141a3168a94cecab403968df9f46d4; 2023-11-23T20:06:00Z; eng; MDPI AG; Electronics; 2079-9292; 2022-09-01; 11(19):3082; 10.3390/electronics11193082; Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer; Xiai Yan (0), Shengkai Ding (1), Wei Zhou (2): School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan 411105, China; Weiqi Shi (3), Hua Tian (4): Department of Information Technology, Hunan Police Academy, Changsha 410138, China; https://www.mdpi.com/2079-9292/11/19/3082; person re-identification (ReID); unsupervised domain adaptive; transformer; vision transformer
spellingShingle	Xiai Yan, Shengkai Ding, Wei Zhou, Weiqi Shi, Hua Tian; Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer; Electronics; person re-identification (ReID); unsupervised domain adaptive; transformer; vision transformer
title	Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer
title_full	Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer
title_fullStr	Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer
title_full_unstemmed	Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer
title_short	Unsupervised Domain Adaptive Person Re-Identification Method Based on Transformer
title_sort	unsupervised domain adaptive person re identification method based on transformer
topic	person re-identification (ReID); unsupervised domain adaptive; transformer; vision transformer
url	https://www.mdpi.com/2079-9292/11/19/3082
work_keys_str_mv	AT xiaiyan unsuperviseddomainadaptivepersonreidentificationmethodbasedontransformer AT shengkaiding unsuperviseddomainadaptivepersonreidentificationmethodbasedontransformer AT weizhou unsuperviseddomainadaptivepersonreidentificationmethodbasedontransformer AT weiqishi unsuperviseddomainadaptivepersonreidentificationmethodbasedontransformer AT huatian unsuperviseddomainadaptivepersonreidentificationmethodbasedontransformer
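
The description in this record says the network is optimized with cross-entropy, triplet, and center loss functions combined. A minimal sketch of how three such terms could be added together follows. It is an illustration only, not the authors' implementation: the function names, the margin of 0.3, the center-loss weight of 0.0005, and the use of plain Python lists for features are all assumptions that do not come from the record.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on (anchor-positive distance) - (anchor-negative distance) + margin.

    The margin value is an assumed placeholder, not taken from the record.
    """
    d_ap = math.sqrt(squared_distance(anchor, positive))
    d_an = math.sqrt(squared_distance(anchor, negative))
    return max(0.0, d_ap - d_an + margin)


def center_loss(feature, center):
    """Half the squared distance between a feature and its class center."""
    return 0.5 * squared_distance(feature, center)


def combined_loss(logits, label, anchor, positive, negative, center,
                  center_weight=0.0005, margin=0.3):
    """Sum of the three terms; center_weight is an assumed hyperparameter."""
    return (cross_entropy(logits, label)
            + triplet_loss(anchor, positive, negative, margin)
            + center_weight * center_loss(anchor, center))
```

In this sketch the cross-entropy term acts on identity logits, the triplet term pulls same-identity features together relative to other identities, and the center term penalizes distance from a per-identity center. How the record's model weights these terms, and how it obtains labels in the unsupervised setting, is not stated here.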
