Joint Appearance and Motion Model With Temporal Transformer for Multiple Object Tracking

The problem of multi-object tracking (MOT) in the real world poses several challenging tasks, such as similar appearance, occlusion, and extreme articulation motion. In this paper, we propose a novel joint appearance and motion model, which is robust to diverse motion and objects with similar unifor...

Bibliographic Details
Main Authors:	Hyunseop Kim, Hyo-Jun Lee, Hanul Kim, Seong-Gyun Jeong, Yeong Jun Koh
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Multi-object tracking; tracking-by-detection; online approach
Online Access:	https://ieeexplore.ieee.org/document/10319413/

author	Hyunseop Kim; Hyo-Jun Lee; Hanul Kim; Seong-Gyun Jeong; Yeong Jun Koh
author_sort	Hyunseop Kim
collection	DOAJ
description	The problem of multi-object tracking (MOT) in the real world poses several challenging tasks, such as similar appearance, occlusion, and extreme articulation motion. In this paper, we propose a novel joint appearance and motion model, which is robust to diverse motion and objects with similar uniform appearance. The proposed MOT method includes a temporal transformer, a motion estimation module, and a ReID embedding module. The temporal transformer is designed to convey object-aware features to the ReID embedding and motion estimation modules. The ReID embedding module extracts ReID features of the detected objects, while the motion estimation module predicts the expected locations of previously tracked objects in the current frame. We also present a motion-guided association that effectively fuses the outputs of the appearance and motion modules. Experimental results demonstrate that the proposed MOT method outperforms state-of-the-art methods on the TAO and DanceTrack datasets, which contain objects with diverse motions and similar appearances. Furthermore, the proposed MOT method provides stable performance on MOT17 and MOT20, which contain objects with simple and regular motion patterns.
first_indexed	2024-03-09T02:04:15Z
format	Article
id	doaj.art-e72ebb086eba44018735a7fab4f42f51
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-09T02:04:15Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-e72ebb086eba44018735a7fab4f42f51 | 2023-12-08T00:07:02Z | eng | IEEE | IEEE Access | 2169-3536 | 2023-01-01 | 11 | 133792 | 133803 | 10.1109/ACCESS.2023.3333366 | 10319413 | Joint Appearance and Motion Model With Temporal Transformer for Multiple Object Tracking | Hyunseop Kim (0); Hyo-Jun Lee (1); Hanul Kim (2); Seong-Gyun Jeong (3); Yeong Jun Koh (4) | https://orcid.org/0000-0003-1805-2960 | Department of Computer Science and Engineering, Chungnam National University, Daejeon, South Korea; Department of Computer Science and Engineering, Chungnam National University, Daejeon, South Korea; Department of Applied Artificial Intelligence, Seoul National University of Science and Technology, Seoul, South Korea; 42dot Inc., Seoul, South Korea; Department of Computer Science and Engineering, Chungnam National University, Daejeon, South Korea | https://ieeexplore.ieee.org/document/10319413/ | Multi-object tracking; tracking-by-detection; online approach
title	Joint Appearance and Motion Model With Temporal Transformer for Multiple Object Tracking
topic	Multi-object tracking; tracking-by-detection; online approach
url	https://ieeexplore.ieee.org/document/10319413/

Similar Items