FusionTrack: Multiple Object Tracking with Enhanced Information Utilization
Multi-object tracking (MOT) is a significant research direction in computer vision. Although existing methods handle simple tasks such as pedestrian tracking well, complex downstream tasks featuring uniform appearance and diverse motion remain difficult. Inspired by DETR, the tracking-by-attenti...
Main Authors: | Yifan Yang, Ziqi He, Jiaxu Wan, Ding Yuan, Hanyang Liu, Xuliang Li, Hong Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-07-01 |
Series: | Applied Sciences |
Subjects: | multiple-object tracking, object detection, computer vision, transformer |
Online Access: | https://www.mdpi.com/2076-3417/13/14/8010 |
_version_ | 1827734193698766848 |
---|---|
author | Yifan Yang, Ziqi He, Jiaxu Wan, Ding Yuan, Hanyang Liu, Xuliang Li, Hong Zhang |
collection | DOAJ |
description | Multi-object tracking (MOT) is a significant research direction in computer vision. Although existing methods handle simple tasks such as pedestrian tracking well, complex downstream tasks featuring uniform appearance and diverse motion remain difficult. Inspired by DETR, the tracking-by-attention (TBA) method uses transformers to accomplish multi-object tracking. However, existing methods within the TBA paradigm still have issues, such as difficulty detecting and tracking objects due to gradient conflict in shared parameters, and insufficient use of features to distinguish similar objects. We introduce FusionTrack to address these issues. It uses a joint track-detection decoder and a score-guided multi-level query fuser to improve the use of information within and between frames. With these improvements, FusionTrack achieves an 11.1% higher HOTA score on the DanceTrack dataset than the baseline model MOTR. |
first_indexed | 2024-03-11T01:21:58Z |
format | Article |
id | doaj.art-d410bfea1d9644488a59eadc1f877a6a |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T01:21:58Z |
publishDate | 2023-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-d410bfea1d9644488a59eadc1f877a6a; 2023-11-18T18:06:50Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2023-07-01; 13; 14; 8010; 10.3390/app13148010; FusionTrack: Multiple Object Tracking with Enhanced Information Utilization; Yifan Yang (0), Ziqi He (1), Jiaxu Wan (2), Ding Yuan (3), Hanyang Liu (4), Xuliang Li (5), Hong Zhang (6); (0) Institute of Artificial Intelligence, Beihang University, Beijing 100191, China; (1) Institute of Artificial Intelligence, Beihang University, Beijing 100191, China; (2) School of Astronautics, Beihang University, Beijing 100191, China; (3) School of Astronautics, Beihang University, Beijing 100191, China; (4) School of Astronautics, Beihang University, Beijing 100191, China; (5) School of Astronautics, Beihang University, Beijing 100191, China; (6) School of Astronautics, Beihang University, Beijing 100191, China; Multi-object tracking (MOT) is a significant research direction in computer vision. Although existing methods handle simple tasks such as pedestrian tracking well, complex downstream tasks featuring uniform appearance and diverse motion remain difficult. Inspired by DETR, the tracking-by-attention (TBA) method uses transformers to accomplish multi-object tracking. However, existing methods within the TBA paradigm still have issues, such as difficulty detecting and tracking objects due to gradient conflict in shared parameters, and insufficient use of features to distinguish similar objects. We introduce FusionTrack to address these issues. It uses a joint track-detection decoder and a score-guided multi-level query fuser to improve the use of information within and between frames. With these improvements, FusionTrack achieves an 11.1% higher HOTA score on the DanceTrack dataset than the baseline model MOTR.; https://www.mdpi.com/2076-3417/13/14/8010; multiple-object tracking; object detection; computer vision; transformer |
title | FusionTrack: Multiple Object Tracking with Enhanced Information Utilization |
topic | multiple-object tracking; object detection; computer vision; transformer |
url | https://www.mdpi.com/2076-3417/13/14/8010 |