Learning attentional policies for tracking and recognition in video with deep networks

We propose a novel attentional model for simultaneous object tracking and recognition that is driven by gaze data. Motivated by theories of the human perceptual system, the model consists of two interacting pathways: ventral and dorsal. The ventral pathway models object appearance and classification...

Bibliographic details
Main Authors:	Bazzani, L; Freitas, N; Larochelle, H; Murino, V; Ting, J
Format:	Conference item
Published:	ACM 2011

author	Bazzani, L; Freitas, N; Larochelle, H; Murino, V; Ting, J
collection	OXFORD
description	We propose a novel attentional model for simultaneous object tracking and recognition that is driven by gaze data. Motivated by theories of the human perceptual system, the model consists of two interacting pathways: ventral and dorsal. The ventral pathway models object appearance and classification using deep (factored) restricted Boltzmann machines. At each point in time, the observations consist of retinal images, with decaying resolution toward the periphery of the gaze. The dorsal pathway models the location, orientation, scale and speed of the attended object. The posterior distribution of these states is estimated with particle filtering. Deeper in the dorsal pathway, we encounter an attentional mechanism that learns to control gazes so as to minimize tracking uncertainty. The approach is modular (with each module easily replaceable with more sophisticated algorithms), straightforward to implement, practically efficient, and works well in simple video sequences.
first_indexed	2024-03-07T03:13:50Z
format	Conference item
id	oxford-uuid:b51e3858-2cc2-43f2-9e49-a8ca3d07d999
institution	University of Oxford
last_indexed	2024-03-07T03:13:50Z
publishDate	2011
publisher	ACM
record_format	dspace
spelling	oxford-uuid:b51e3858-2cc2-43f2-9e49-a8ca3d07d999; 2022-03-27T04:31:02Z; Learning attentional policies for tracking and recognition in video with deep networks; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:b51e3858-2cc2-43f2-9e49-a8ca3d07d999; Department of Computer Science; ACM; 2011; Bazzani, L; Freitas, N; Larochelle, H; Murino, V; Ting, J
title	Learning attentional policies for tracking and recognition in video with deep networks
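
The description field above states that the posterior over the attended object's state is estimated with particle filtering. As a rough illustration of that general technique only, and not the authors' implementation, the following is a minimal bootstrap particle filter sketch in Python. It assumes a reduced state of one-dimensional position and speed (the record lists location, orientation, scale and speed), a constant-velocity motion model, and a simple Gaussian likelihood standing in for the appearance model. All function names, parameters and noise values are hypothetical.

```python
import math
import random


def particle_filter_step(particles, observation, rng,
                         process_std=0.5, obs_std=1.0):
    """One predict/weight/resample step of a bootstrap particle filter.

    Each particle is a (position, speed) pair; `observation` is a noisy
    position measurement. Names and noise values are illustrative
    assumptions, not taken from the catalogued paper.
    """
    # Predict: perturb the speed, then move each particle by its new speed.
    predicted = []
    for pos, speed in particles:
        new_speed = speed + rng.gauss(0.0, process_std)
        predicted.append((pos + new_speed, new_speed))

    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - pos) / obs_std) ** 2)
               for pos, _ in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)

    # Resample: draw particles with probability proportional to weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Posterior mean of position and speed over equally weighted particles."""
    n = len(particles)
    return (sum(p for p, _ in particles) / n,
            sum(s for _, s in particles) / n)


def run_demo(steps=30, n_particles=500, seed=0):
    """Track a target moving at speed 1.0 from noisy position readings."""
    rng = random.Random(seed)
    particles = [(rng.gauss(0.0, 2.0), rng.gauss(0.0, 1.0))
                 for _ in range(n_particles)]
    true_pos = 0.0
    for _ in range(steps):
        true_pos += 1.0
        observation = true_pos + rng.gauss(0.0, 1.0)
        particles = particle_filter_step(particles, observation, rng)
    return true_pos, estimate(particles)
```

Calling `run_demo()` returns the simulated true position together with the filter's estimate of position and speed. In a tracker of the kind the description outlines, the Gaussian likelihood would presumably be replaced by a score from the appearance model, and the spread of the particles could serve as a measure of tracking uncertainty.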
