Long-term tracking with transformer and template update
Abstract Aiming at the tracking failure due to the disappearance of the target in the long-term target tracking process, this paper proposes a long-term target tracking network based on the visual transformer and template update. First of all, we construct a feature extraction network based on the t...
Main Authors: | Hongying Zhang, Xiaowen Peng, Xuyong Wang |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen 2022-12-01 |
Series: | EURASIP Journal on Advances in Signal Processing |
Subjects: | Transformer, Long-term tracking, Template update |
Online Access: | https://doi.org/10.1186/s13634-022-00954-4 |
_version_ | 1797973406020272128 |
---|---|
author | Hongying Zhang Xiaowen Peng Xuyong Wang |
author_facet | Hongying Zhang Xiaowen Peng Xuyong Wang |
author_sort | Hongying Zhang |
collection | DOAJ |
description | Abstract Aiming at the tracking failure due to the disappearance of the target in the long-term target tracking process, this paper proposes a long-term target tracking network based on the visual transformer and template update. First, we construct a feature extraction network based on the transformer and adopt a knowledge distillation strategy to improve the effectiveness of the network for global feature extraction. Second, in the modeling transformer, the encoder fully fuses the target features with the search area features, and the decoder learns the position information in the target query. Target prediction is then performed on the encoder–decoder output to obtain the tracking result. Meanwhile, we design a score head to judge the validity of the dynamic template of the current frame before tracking in the next frame, and we select the appropriate dynamic template for the next frame according to the score. We performed extensive experiments on the LaSOT, VOT2021-LT, TrackingNet, TLP, and UAV123 datasets, and the experimental results demonstrate the effectiveness of our method. In particular, it exceeds STARK by 0.8% (F score) on VOT2021-LT, by 1.0% (S score) on LaSOT, and by 1.1% (NP score) on TrackingNet, which also demonstrates the superiority of the proposed method. |
first_indexed | 2024-04-11T04:03:45Z |
format | Article |
id | doaj.art-2299711249ec4dae942d9efc07a3143b |
institution | Directory Open Access Journal |
issn | 1687-6180 |
language | English |
last_indexed | 2024-04-11T04:03:45Z |
publishDate | 2022-12-01 |
publisher | SpringerOpen |
record_format | Article |
series | EURASIP Journal on Advances in Signal Processing |
spelling | doaj.art-2299711249ec4dae942d9efc07a3143b 2023-01-01T12:30:11Z eng SpringerOpen EURASIP Journal on Advances in Signal Processing 1687-6180 2022-12-01 20221117 10.1186/s13634-022-00954-4 Long-term tracking with transformer and template update Hongying Zhang 0 Xiaowen Peng 1 Xuyong Wang 2 Civil Aviation University of China Civil Aviation University of China Civil Aviation University of China Abstract Aiming at the tracking failure due to the disappearance of the target in the long-term target tracking process, this paper proposes a long-term target tracking network based on the visual transformer and template update. First, we construct a feature extraction network based on the transformer and adopt a knowledge distillation strategy to improve the effectiveness of the network for global feature extraction. Second, in the modeling transformer, the encoder fully fuses the target features with the search area features, and the decoder learns the position information in the target query. Target prediction is then performed on the encoder–decoder output to obtain the tracking result. Meanwhile, we design a score head to judge the validity of the dynamic template of the current frame before tracking in the next frame, and we select the appropriate dynamic template for the next frame according to the score. We performed extensive experiments on the LaSOT, VOT2021-LT, TrackingNet, TLP, and UAV123 datasets, and the experimental results demonstrate the effectiveness of our method. In particular, it exceeds STARK by 0.8% (F score) on VOT2021-LT, by 1.0% (S score) on LaSOT, and by 1.1% (NP score) on TrackingNet, which also demonstrates the superiority of the proposed method. https://doi.org/10.1186/s13634-022-00954-4 Transformer Long-term tracking Template update |
spellingShingle | Hongying Zhang Xiaowen Peng Xuyong Wang Long-term tracking with transformer and template update EURASIP Journal on Advances in Signal Processing Transformer Long-term tracking Template update |
title | Long-term tracking with transformer and template update |
title_full | Long-term tracking with transformer and template update |
title_fullStr | Long-term tracking with transformer and template update |
title_full_unstemmed | Long-term tracking with transformer and template update |
title_short | Long-term tracking with transformer and template update |
title_sort | long term tracking with transformer and template update |
topic | Transformer Long-term tracking Template update |
url | https://doi.org/10.1186/s13634-022-00954-4 |
work_keys_str_mv | AT hongyingzhang longtermtrackingwithtransformerandtemplateupdate AT xiaowenpeng longtermtrackingwithtransformerandtemplateupdate AT xuyongwang longtermtrackingwithtransformerandtemplateupdate |
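
The abstract describes a score head that judges whether the current frame's dynamic template is valid before it is used to track the next frame. The record gives no implementation details, so the following is only a minimal sketch of that gating idea in Python. The names `TemplateState` and `update_template`, the threshold of 0.5, and the use of strings as stand-ins for template crops are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class TemplateState:
    """Templates kept by a tracker (hypothetical structure)."""
    initial: Any                    # template from the first frame, never replaced
    dynamic: Optional[Any] = None   # most recent template judged valid


def update_template(state: TemplateState, candidate: Any, score: float,
                    threshold: float = 0.5) -> TemplateState:
    """Accept `candidate` as the dynamic template only if its score is high enough.

    `score` stands in for the output of a score head; `threshold` is an
    assumed value, not one taken from the paper.
    """
    if score > threshold:
        return replace(state, dynamic=candidate)
    # Low score (e.g. target occluded or out of view): keep the old template.
    return state


# Usage: strings stand in for image crops, floats for score-head outputs.
state = TemplateState(initial="frame0")
for candidate, score in [("frame1", 0.9), ("frame2", 0.2), ("frame3", 0.7)]:
    state = update_template(state, candidate, score)
```

In this sketch a low-scoring candidate such as `frame2` is discarded and the previous dynamic template is retained, which is one plausible reading of selecting "the appropriate dynamic template" when the target has disappeared.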