Helping hands: an object-aware ego-centric video recognition model

We introduce an object-aware decoder for improving the performance of spatio-temporal representations on egocentric videos. The key idea is to enhance object-awareness during training by tasking the model to predict hand positions, object positions, and the semantic label of the objects using paired...

Bibliographic Details
Main Authors: Zhang, C; Gupta, A; Zisserman, A
Format: Conference item
Language: English
Published: IEEE 2024