Do different tracking tasks require different appearance models?

Tracking objects of interest in a video is one of the most popular and widely applicable problems in computer vision. However, over the years, a Cambrian explosion of use cases and benchmarks has fragmented the problem into a multitude of different experimental setups. As a consequence, the literature...

Bibliographic Details
Main Authors: Wang, Z, Zhao, H, Li, Y-L, Wang, S, Torr, P, Bertinetto, L
Format: Conference item
Language: English
Published: Curran Associates 2022
_version_ 1817931295937789952
author Wang, Z
Zhao, H
Li, Y-L
Wang, S
Torr, P
Bertinetto, L
collection OXFORD
description Tracking objects of interest in a video is one of the most popular and widely applicable problems in computer vision. However, over the years, a Cambrian explosion of use cases and benchmarks has fragmented the problem into a multitude of different experimental setups. As a consequence, the literature has fragmented too, and novel approaches proposed by the community are now usually specialised to fit only one specific setup. To understand to what extent this specialisation is necessary, in this work we present UniTrack, a solution to address five different tasks within the same framework. UniTrack consists of a single, task-agnostic appearance model, which can be learned in a supervised or self-supervised fashion, and multiple "heads" that address individual tasks and do not require training. We show how most tracking tasks can be solved within this framework, and that the same appearance model can be successfully used to obtain results that are competitive against specialised methods for most of the tasks considered. The framework also allows us to analyse appearance models obtained with the most recent self-supervised methods, thus extending their evaluation and comparison to a larger variety of important problems.
first_indexed 2024-03-07T07:31:15Z
format Conference item
id oxford-uuid:fed7dd27-96db-4514-859e-622f014aff3f
institution University of Oxford
language English
last_indexed 2024-12-09T03:19:45Z
publishDate 2022
publisher Curran Associates
record_format dspace
spelling oxford-uuid:fed7dd27-96db-4514-859e-622f014aff3f | 2024-10-31T11:53:33Z | Do different tracking tasks require different appearance models? | Conference item | http://purl.org/coar/resource_type/c_5794 | uuid:fed7dd27-96db-4514-859e-622f014aff3f | English | Symplectic Elements | Curran Associates | 2022 | Wang, Z; Zhao, H; Li, Y-L; Wang, S; Torr, P; Bertinetto, L
title Do different tracking tasks require different appearance models?