Multimodal Features Alignment for Vision–Language Object Tracking

Vision–language tracking presents a crucial challenge in multimodal object tracking. Integrating language features and visual features can enhance target localization and improve the stability and accuracy of the tracking process. However, most existing fusion models in vision–language trackers simp...

Bibliographic Details
Main Authors: Ping Ye, Gang Xiao, Jun Liu
Format: Article
Language: English
Published: MDPI AG 2024-03-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/16/7/1168