Evolution of Siamese Visual Tracking with Slot Attention

Siamese network object tracking is widely used because of its simplicity and effectiveness. It first employs a two-stream network to extract template and search-region features independently. These features are then combined through feature association to yield object...

Bibliographic Details
Main Authors: Jian Wang, Xiangzhou Ye, Dongjie Wu, Jinfu Gong, Xinyi Tang, Zheng Li
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Electronics
Subjects: visual object tracking (VOT); feature association; nonlinear feature; Slot Attention
Online Access: https://www.mdpi.com/2079-9292/13/3/586
author Jian Wang
Xiangzhou Ye
Dongjie Wu
Jinfu Gong
Xinyi Tang
Zheng Li
collection DOAJ
description Siamese network object tracking is widely used because of its simplicity and effectiveness. It first employs a two-stream network to extract template and search-region features independently. These features are then combined through feature association to yield object information within the visual scene. However, the conventional approach uses the template features as a convolution kernel applied to the search image features, which limits its ability to capture complex, nonlinear feature transformations of objects and thereby weakens its discriminative capability. To overcome this limitation, we propose replacing traditional convolutional correlation with Slot Attention for feature association. This approach enables the effective extraction of nonlinear features within the scene while increasing discriminative capacity. Furthermore, to increase inference efficiency and reduce the parameter count, we suggest deploying a single Slot Attention module for multiple associations. Our tracking algorithm, SiamSlot, was evaluated on diverse benchmarks, including VOT2019, GOT-10k, UAV123, and Nfs. The experiments show a remarkable improvement in performance relative to previous methods at the same network size.
first_indexed 2024-03-08T03:58:36Z
format Article
id doaj.art-5e81e9633bf7422fb439724980b7821f
institution Directory of Open Access Journals
issn 2079-9292
language English
last_indexed 2024-03-08T03:58:36Z
publishDate 2024-01-01
publisher MDPI AG
record_format Article
series Electronics
doi 10.3390/electronics13030586
volume 13
issue 3
article_number 586
affiliation Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China (all six authors)
title Evolution of Siamese Visual Tracking with Slot Attention
topic visual object tracking (VOT)
feature association
nonlinear feature
Slot Attention
url https://www.mdpi.com/2079-9292/13/3/586