SiamLST: Learning Spatial and Channel-wise Transform for Visual Tracking
Siamese network based trackers regard visual tracking as a similarity matching task between the target template and search region patches, and have achieved a good balance between accuracy and speed in recent years. However, existing trackers do not effectively exploit spatial and inter-channel cues,...
Main Authors: | Jun Wang, Limin Zhang, Yuanyun Wang, Changwang Lai, Wenhui Yang, Chengzhi Deng |
---|---|
Format: | Article |
Language: | English |
Published: | Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek, 2022-01-01 |
Series: | Tehnički Vjesnik |
Subjects: | deep learning; siamese network; sparse transform; visual tracking |
Online Access: | https://hrcak.srce.hr/file/404817 |
author | Jun Wang; Limin Zhang; Yuanyun Wang; Changwang Lai; Wenhui Yang; Chengzhi Deng |
collection | DOAJ |
description | Siamese network based trackers regard visual tracking as a similarity matching task between the target template and search region patches, and have achieved a good balance between accuracy and speed in recent years. However, existing trackers do not effectively exploit spatial and inter-channel cues, which leads to redundancy in the pre-trained model parameters. In this paper, we design a novel visual tracker based on a Learnable Spatial and Channel-wise Transform in a Siamese network (SiamLST). The SiamLST tracker includes a powerful feature extraction backbone and an efficient cross-correlation method. The proposed algorithm takes full advantage of a CNN and the learnable sparse transform module to represent the template and search patches, which effectively exploits spatial and channel-wise correlations to deal with complicated scenarios such as motion blur, in-plane rotation and partial occlusion. Experimental results on multiple tracking benchmarks, including OTB2015, VOT2016, GOT-10k and VOT2018, demonstrate that the proposed SiamLST achieves excellent tracking performance. |
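The pipeline the abstract describes — backbone features for the template and search patches, a channel-wise transform applied to those features, and cross-correlation to produce a response map whose peak localizes the target — can be sketched generically as follows. This is an illustrative NumPy sketch of the standard Siamese matching scheme, not the authors' implementation: `channel_weights` is a hypothetical squeeze-and-excitation-style stand-in for the paper's learnable spatial and channel-wise transform, and the feature maps are random placeholders for real CNN outputs.

```python
import numpy as np

def channel_weights(feat):
    """Illustrative channel-wise reweighting (a stand-in for SiamLST's
    learnable transform): global-average-pool each channel, gate it
    through a sigmoid, and rescale the feature map channel by channel."""
    pooled = feat.mean(axis=(1, 2))           # (C,) per-channel statistic
    gate = 1.0 / (1.0 + np.exp(-pooled))      # sigmoid gate in (0, 1)
    return feat * gate[:, None, None]

def cross_correlation(template, search):
    """Slide the template feature map over the search feature map and
    return a response map of similarity scores (the Siamese matching step)."""
    c, th, tw = template.shape
    _, sh, sw = search.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = search[:, y:y + th, x:x + tw]
            out[y, x] = np.sum(window * template)
    return out

rng = np.random.default_rng(0)
template = channel_weights(rng.standard_normal((8, 6, 6)))    # stand-in template features
search = channel_weights(rng.standard_normal((8, 22, 22)))    # stand-in search-region features
response = cross_correlation(template, search)                # (17, 17) response map
peak = np.unravel_index(response.argmax(), response.shape)    # predicted target location
```

In a real tracker the response map is upsampled back to image coordinates and the peak gives the new bounding-box center; a learned transform like the gating above lets the matcher emphasize the channels most discriminative for the current target.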
format | Article |
id | doaj.art-ff6b0d6f70c54f31b1f7eedcc895d05b |
institution | Directory Open Access Journal |
issn | 1330-3651 1848-6339 |
language | English |
publishDate | 2022-01-01 |
publisher | Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek |
series | Tehnički Vjesnik |
spelling | doaj.art-ff6b0d6f70c54f31b1f7eedcc895d05b |
doi | 10.17559/TV-20211115041517 |
citation | Tehnički Vjesnik, Vol. 29, No. 4 (2022-01-01), pp. 1202-1209 |
affiliation | School of Information Engineering, Nanchang Institute of Technology, Nanchang, China (all six authors) |
title | SiamLST: Learning Spatial and Channel-wise Transform for Visual Tracking |
topic | deep learning; siamese network; sparse transform; visual tracking |
url | https://hrcak.srce.hr/file/404817 |