FusionTrack: Multiple Object Tracking with Enhanced Information Utilization

Bibliographic Details
Main Authors: Yifan Yang, Ziqi He, Jiaxu Wan, Ding Yuan, Hanyang Liu, Xuliang Li, Hong Zhang
Format: Article
Language: English
Published: MDPI AG 2023-07-01
Series: Applied Sciences
Subjects: multiple-object tracking, object detection, computer vision, transformer
Online Access: https://www.mdpi.com/2076-3417/13/14/8010
collection DOAJ
description Multi-object tracking (MOT) is an important research direction in computer vision. Although existing methods handle simple tasks such as pedestrian tracking well, complex downstream tasks in which objects share a uniform appearance and move in diverse ways remain difficult. Inspired by DETR, tracking-by-attention (TBA) methods use transformers to perform multi-object tracking. However, existing TBA methods still have problems, such as difficulty detecting and tracking objects because of gradient conflicts in shared parameters, and insufficient use of features to distinguish similar objects. We introduce FusionTrack to address these issues. It uses a joint track-detection decoder and a score-guided multi-level query fuser to make better use of information within and between frames. With these improvements, FusionTrack achieves an 11.1% higher HOTA score on the DanceTrack dataset than the baseline model MOTR.
first_indexed 2024-03-11T01:21:58Z
format Article
id doaj.art-d410bfea1d9644488a59eadc1f877a6a
institution Directory of Open Access Journals
issn 2076-3417
language English
last_indexed 2024-03-11T01:21:58Z
publishDate 2023-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-d410bfea1d9644488a59eadc1f877a6a; 2023-11-18T18:06:50Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-07-01; volume 13, issue 14, article 8010; DOI 10.3390/app13148010
FusionTrack: Multiple Object Tracking with Enhanced Information Utilization
Yifan Yang, Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Ziqi He, Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Jiaxu Wan, School of Astronautics, Beihang University, Beijing 100191, China
Ding Yuan, School of Astronautics, Beihang University, Beijing 100191, China
Hanyang Liu, School of Astronautics, Beihang University, Beijing 100191, China
Xuliang Li, School of Astronautics, Beihang University, Beijing 100191, China
Hong Zhang, School of Astronautics, Beihang University, Beijing 100191, China
https://www.mdpi.com/2076-3417/13/14/8010
multiple-object tracking; object detection; computer vision; transformer