Keeping your eye on the ball: Trajectory attention in video transformers

In video transformers, the time dimension is often treated in the same way as the two spatial dimensions. However, in a scene where objects or the camera may move, a physical point imaged at one location in frame t may be entirely unrelated to what is found at that location in frame t + k. These tem...

Bibliographic details
Main Authors: Patrick, M, Campbell, D, Asano, Y, Misra, I, Metze, F, Feichtenhofer, C, Vedaldi, A, Henriques, JF
Format: Conference item
Language: English
Published: Neural Information Processing Systems Foundation 2021