Transformers in Pedestrian Image Retrieval and Person Re-Identification in a Multi-Camera Surveillance System

Person Re-Identification is an essential task in computer vision, particularly in surveillance applications. The aim is to identify a person based on an input image from surveillance photographs in various scenarios. Most Person re-ID techniques utilize Convolutional Neural Networks (CNNs); however,...

Bibliographic Details
Main Authors: Muhammad Tahir, Saeed Anwar
Format: Article
Language: English
Published: MDPI AG 2021-10-01
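Series: Applied Sciences
Subjects: vision transformers; deep learning; re-ID; image retrieval; multi-camera surveillance system; pedestrian identification
Online Access: https://www.mdpi.com/2076-3417/11/19/9197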
collection DOAJ
description Person re-identification (re-ID) is an essential task in computer vision, particularly in surveillance applications. The aim is to identify a person, given an input image, among surveillance photographs captured in various scenarios. Most person re-ID techniques use Convolutional Neural Networks (CNNs); however, Vision Transformers are replacing pure CNNs in various computer vision tasks such as object recognition and classification. Vision transformers capture information about local regions of the image, and current techniques exploit this to improve accuracy on the task at hand. We propose using vision transformers in conjunction with vanilla CNN models to investigate the true strength of transformers in person re-identification. We employ three backbones with different combinations of vision transformers on two benchmark datasets. The overall performance of the backbones increased, showing the importance of vision transformers. We provide ablation studies and show the importance of various components of the vision transformers in re-identification tasks.
first_indexed 2024-03-10T07:06:07Z
id doaj.art-cff1b127d7aa44269614d5afd6c1be3d
institution Directory Open Access Journal
issn 2076-3417
last_indexed 2024-03-10T07:06:07Z
spelling doaj.art-cff1b127d7aa44269614d5afd6c1be3d; 2023-11-22T15:49:10Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2021-10-01; vol. 11, no. 19, art. 9197; doi:10.3390/app11199197; Muhammad Tahir (College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia); Saeed Anwar (Data61-Commonwealth Scientific and Industrial Research Organization (CSIRO), Clayton South, VIC 3169, Australia); keywords: vision transformers, deep learning, re-ID, image retrieval, multi-camera surveillance system, pedestrian identification