Hierarchical attentive recurrent tracking
Class-agnostic object tracking is particularly difficult in cluttered environments as target-specific discriminative models cannot be learned a priori. Inspired by how the human visual cortex employs spatial attention and separate “where” and “what” processing pathways to actively suppress irrelevan...
Main Authors: | Kosiorek, A, Bewley, A, Posner, H |
---|---|
Format: | Conference item |
Published: | Neural Information Processing Systems 2018 |
_version_ | 1797081779003195392 |
---|---|
author | Kosiorek, A Bewley, A Posner, H |
author_facet | Kosiorek, A Bewley, A Posner, H |
author_sort | Kosiorek, A |
collection | OXFORD |
description | Class-agnostic object tracking is particularly difficult in cluttered environments as target-specific discriminative models cannot be learned a priori. Inspired by how the human visual cortex employs spatial attention and separate “where” and “what” processing pathways to actively suppress irrelevant visual features, this work develops a hierarchical attentive recurrent model for single object tracking in videos. The first layer of attention discards the majority of background by selecting a region containing the object of interest, while the subsequent layers tune in on visual features particular to the tracked object. This framework is fully differentiable and can be trained in a purely data-driven fashion by gradient methods. To improve training convergence, we augment the loss function with terms for a number of auxiliary tasks relevant for tracking. Evaluation of the proposed model is performed on two datasets of increasing difficulty: pedestrian tracking on the KTH activity recognition dataset and the KITTI object tracking dataset. |
first_indexed | 2024-03-07T01:18:49Z |
format | Conference item |
id | oxford-uuid:8fa0fddd-7b5f-4903-b40d-9b4133a3965d |
institution | University of Oxford |
last_indexed | 2024-03-07T01:18:49Z |
publishDate | 2018 |
publisher | Neural Information Processing Systems |
record_format | dspace |
title | Hierarchical attentive recurrent tracking |
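
The description above mentions a first attention layer that selects a region containing the object, later layers that weight object-specific features, and a loss augmented with auxiliary terms. The following is a minimal NumPy sketch of those three ideas, not the authors' implementation. The function names, the `(y, x, h, w)` box format, the sigmoid mask, and the weighted-sum loss are all assumptions made for illustration. In particular, the hard crop used here is not differentiable, whereas the record describes the actual framework as fully differentiable.

```python
import numpy as np


def crop_glimpse(frame, box):
    """First attention stage (sketch): keep only the region around the object.

    `box` is a hypothetical (y, x, h, w) tuple in pixels. The crop is
    clipped to the frame so out-of-range boxes do not raise errors.
    """
    y, x, h, w = box
    height, width = frame.shape[:2]
    y0, x0 = max(0, y), max(0, x)
    y1, x1 = min(height, y + h), min(width, x + w)
    return frame[y0:y1, x0:x1]


def apply_feature_mask(features, mask_logits):
    """Later attention stage (sketch): weight features by a soft mask.

    The sigmoid maps logits to (0, 1); this choice is an assumption.
    """
    mask = 1.0 / (1.0 + np.exp(-mask_logits))
    return features * mask


def total_loss(main_loss, aux_losses, aux_weights):
    """Main tracking loss plus weighted auxiliary-task terms (assumed form)."""
    return main_loss + sum(w * l for w, l in zip(aux_weights, aux_losses))


if __name__ == "__main__":
    frame = np.arange(100, dtype=float).reshape(10, 10)
    glimpse = crop_glimpse(frame, (2, 3, 4, 5))
    masked = apply_feature_mask(glimpse, np.zeros_like(glimpse))
    print(glimpse.shape, masked.shape)
    print(total_loss(1.0, [2.0, 4.0], [0.5, 0.25]))
```

Cropping first means the later stage only processes a small window, which reflects the description's point that most of the background is discarded before object-specific features are examined.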