Helping hands: an object-aware ego-centric video recognition model

We introduce an object-aware decoder for improving the performance of spatio-temporal representations on egocentric videos. The key idea is to enhance object-awareness during training by tasking the model to predict hand positions, object positions, and the semantic label of the objects using paired...

Bibliographic Details
Main Authors: Zhang, C; Gupta, A; Zisserman, A
Format: Conference item
Language: English
Published: IEEE 2024