Siamese Network Tracker Based on Multi-Scale Feature Fusion

The main task in visual object tracking is to track a moving object in an image sequence. In this process, the object’s trajectory and behavior can be described by calculating the object’s position, velocity, acceleration, and other parameters or by memorizing the position of the object in each fram...

Bibliographic Details
Main Authors:	Jiaxu Zhao, Dapeng Niu
Format:	Article
Language:	English
Published:	MDPI AG 2023-08-01
Series:	Systems
Subjects:	visual object tracking automated driving deep learning artificial intelligence computer vision
Online Access:	https://www.mdpi.com/2079-8954/11/8/434

_version_	1797583146109108224
author	Jiaxu Zhao Dapeng Niu
author_facet	Jiaxu Zhao Dapeng Niu
author_sort	Jiaxu Zhao
collection	DOAJ
description	The main task in visual object tracking is to track a moving object in an image sequence. In this process, the object’s trajectory and behavior can be described by calculating the object’s position, velocity, acceleration, and other parameters or by memorizing the position of the object in each frame of the corresponding video. Therefore, visual object tracking can complete many more advanced tasks, has great performance in relation to real scenes, and is widely used in automated driving, traffic monitoring, human–computer interaction, and so on. Siamese-network-based trackers have been receiving a great deal of attention from the tracking community, but they have many drawbacks. This paper analyzes the shortcomings of the Siamese network tracker in detail, uses the method of feature multi-scale fusion to improve the Siamese network tracker, and proposes a new target-tracking framework to address its shortcomings. In this paper, a feature map with low-resolution but strong semantic information and a feature map with high-resolution and rich spatial information are integrated to improve the model’s ability to depict an object, and the problem of scale change is solved by fusing features at different scales. Furthermore, we utilize the 3D Max Filtering module to suppress repeated predictions of features at different scales. Finally, our experiments conducted on the four tracking benchmarks OTB2015, VOT2016, VOT2018, and GOT10K show that the proposed algorithm effectively improves the tracking accuracy and robustness of the system.
first_indexed	2024-03-10T23:32:41Z
format	Article
id	doaj.art-79ae715b22af435f92ca68e4af17c9d1
institution	Directory of Open Access Journals
issn	2079-8954
language	English
last_indexed	2024-03-10T23:32:41Z
publishDate	2023-08-01
publisher	MDPI AG
record_format	Article
series	Systems
spelling	doaj.art-79ae715b22af435f92ca68e4af17c9d1 2023-11-19T03:13:08Z eng MDPI AG Systems 2079-8954 2023-08-01 11 8 434 10.3390/systems11080434 Siamese Network Tracker Based on Multi-Scale Feature Fusion Jiaxu Zhao 0 Dapeng Niu 1 College of Information Science and Engineering, Northeastern University, Shenyang 110819, China College of Information Science and Engineering, Northeastern University, Shenyang 110819, China The main task in visual object tracking is to track a moving object in an image sequence. In this process, the object’s trajectory and behavior can be described by calculating the object’s position, velocity, acceleration, and other parameters or by memorizing the position of the object in each frame of the corresponding video. Therefore, visual object tracking can complete many more advanced tasks, has great performance in relation to real scenes, and is widely used in automated driving, traffic monitoring, human–computer interaction, and so on. Siamese-network-based trackers have been receiving a great deal of attention from the tracking community, but they have many drawbacks. This paper analyzes the shortcomings of the Siamese network tracker in detail, uses the method of feature multi-scale fusion to improve the Siamese network tracker, and proposes a new target-tracking framework to address its shortcomings. In this paper, a feature map with low-resolution but strong semantic information and a feature map with high-resolution and rich spatial information are integrated to improve the model’s ability to depict an object, and the problem of scale change is solved by fusing features at different scales. Furthermore, we utilize the 3D Max Filtering module to suppress repeated predictions of features at different scales. Finally, our experiments conducted on the four tracking benchmarks OTB2015, VOT2016, VOT2018, and GOT10K show that the proposed algorithm effectively improves the tracking accuracy and robustness of the system. https://www.mdpi.com/2079-8954/11/8/434 visual object tracking automated driving deep learning artificial intelligence computer vision
spellingShingle	Jiaxu Zhao Dapeng Niu Siamese Network Tracker Based on Multi-Scale Feature Fusion Systems visual object tracking automated driving deep learning artificial intelligence computer vision
title	Siamese Network Tracker Based on Multi-Scale Feature Fusion
title_full	Siamese Network Tracker Based on Multi-Scale Feature Fusion
title_fullStr	Siamese Network Tracker Based on Multi-Scale Feature Fusion
title_full_unstemmed	Siamese Network Tracker Based on Multi-Scale Feature Fusion
title_short	Siamese Network Tracker Based on Multi-Scale Feature Fusion
title_sort	siamese network tracker based on multi scale feature fusion
topic	visual object tracking automated driving deep learning artificial intelligence computer vision
url	https://www.mdpi.com/2079-8954/11/8/434
work_keys_str_mv	AT jiaxuzhao siamesenetworktrackerbasedonmultiscalefeaturefusion AT dapengniu siamesenetworktrackerbasedonmultiscalefeaturefusion

Similar Items