Evolution of Siamese Visual Tracking with Slot Attention
Siamese network object tracking is widely used because of its simplicity and effectiveness. It first employs a two-stream network to extract template and search-region features independently. These features are then combined through feature association to yield object...
Main Authors: | Jian Wang, Xiangzhou Ye, Dongjie Wu, Jinfu Gong, Xinyi Tang, Zheng Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-01-01 |
Series: | Electronics |
Subjects: | visual object tracking (VOT); feature association; nonlinear feature; Slot Attention |
Online Access: | https://www.mdpi.com/2079-9292/13/3/586 |
_version_ | 1797318866396774400 |
---|---|
author | Jian Wang; Xiangzhou Ye; Dongjie Wu; Jinfu Gong; Xinyi Tang; Zheng Li |
author_facet | Jian Wang; Xiangzhou Ye; Dongjie Wu; Jinfu Gong; Xinyi Tang; Zheng Li |
author_sort | Jian Wang |
collection | DOAJ |
description | Siamese network object tracking is widely used because of its simplicity and effectiveness. It first employs a two-stream network to extract template and search-region features independently. These features are then combined through feature association to yield object information within the visual scene. However, the conventional approach uses the template features as a convolution kernel over the search-image features, which limits its ability to capture complex, nonlinear feature transformations of objects and therefore its discriminative capability. To overcome this limitation, we propose replacing the traditional convolutional correlation with Slot Attention for feature association. This approach enables effective extraction of nonlinear features within the scene while strengthening discriminative capacity. Furthermore, to increase inference efficiency and reduce the parameter count, we suggest using a single Slot Attention module for multiple associations. Our tracking algorithm, SiamSlot, was evaluated on diverse benchmarks, including VOT2019, GOT-10k, UAV123, and Nfs. The experiments show a marked improvement in performance relative to previous methods of the same network size. |
first_indexed | 2024-03-08T03:58:36Z |
format | Article |
id | doaj.art-5e81e9633bf7422fb439724980b7821f |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
language | English |
last_indexed | 2024-03-08T03:58:36Z |
publishDate | 2024-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-5e81e9633bf7422fb439724980b7821f; 2024-02-09T15:10:43Z; eng; MDPI AG; Electronics; 2079-9292; 2024-01-01; 13(3):586; 10.3390/electronics13030586; Evolution of Siamese Visual Tracking with Slot Attention; Jian Wang, Xiangzhou Ye, Dongjie Wu, Jinfu Gong, Xinyi Tang, Zheng Li (all: Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China); Siamese network object tracking is widely used because of its simplicity and effectiveness. It first employs a two-stream network to extract template and search-region features independently. These features are then combined through feature association to yield object information within the visual scene. However, the conventional approach uses the template features as a convolution kernel over the search-image features, which limits its ability to capture complex, nonlinear feature transformations of objects and therefore its discriminative capability. To overcome this limitation, we propose replacing the traditional convolutional correlation with Slot Attention for feature association. This approach enables effective extraction of nonlinear features within the scene while strengthening discriminative capacity. Furthermore, to increase inference efficiency and reduce the parameter count, we suggest using a single Slot Attention module for multiple associations. Our tracking algorithm, SiamSlot, was evaluated on diverse benchmarks, including VOT2019, GOT-10k, UAV123, and Nfs. The experiments show a marked improvement in performance relative to previous methods of the same network size.; https://www.mdpi.com/2079-9292/13/3/586; visual object tracking (VOT); feature association; nonlinear feature; Slot Attention |
spellingShingle | Jian Wang; Xiangzhou Ye; Dongjie Wu; Jinfu Gong; Xinyi Tang; Zheng Li; Evolution of Siamese Visual Tracking with Slot Attention; Electronics; visual object tracking (VOT); feature association; nonlinear feature; Slot Attention |
title | Evolution of Siamese Visual Tracking with Slot Attention |
topic | visual object tracking (VOT); feature association; nonlinear feature; Slot Attention |
url | https://www.mdpi.com/2079-9292/13/3/586 |
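
The two feature-association styles contrasted in the description can be illustrated with a minimal NumPy sketch. It is not the SiamSlot implementation: the function names, tensor shapes, and the choice to initialise slots from template features are assumptions made here for illustration, and the attention step omits the learned projections, GRU update, and MLP of full Slot Attention.

```python
import numpy as np


def cross_correlation(template, search):
    """Conventional association: slide the template (C, h, w) over the
    search features (C, H, W) as a convolution kernel.
    Returns a response map of shape (H - h + 1, W - w + 1)."""
    _, h, w = template.shape
    _, H, W = search.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Inner product between the template and one search window.
            out[i, j] = np.sum(template * search[:, i:i + h, j:j + w])
    return out


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def slot_attention_step(slots, inputs, eps=1e-8):
    """One simplified Slot Attention iteration (assumption: no learned
    projections, GRU, or MLP).
    slots: (K, D), inputs: (N, D). Returns updated slots of shape (K, D)."""
    d = slots.shape[1]
    logits = inputs @ slots.T / np.sqrt(d)            # (N, K)
    # Softmax over the slot axis: slots compete for each input location.
    attn = softmax(logits, axis=1)                    # (N, K)
    # Normalise over inputs so each slot takes a weighted mean.
    weights = attn / (attn.sum(axis=0, keepdims=True) + eps)
    return weights.T @ inputs                         # (K, D)


rng = np.random.default_rng(0)
template = rng.standard_normal((8, 3, 3))   # hypothetical template features
search = rng.standard_normal((8, 7, 7))     # hypothetical search features

response = cross_correlation(template, search)        # shape (5, 5)

# Assumed setup: template locations act as slots, search locations as inputs.
slots = template.reshape(8, -1).T                     # (9, 8)
inputs = search.reshape(8, -1).T                      # (49, 8)
updated = slot_attention_step(slots, inputs)          # shape (9, 8)
```

The correlation is linear in the search features, whereas the softmax in the attention step makes the association nonlinear, which is the property the description highlights.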