Seeing What You’re Told: Sentence-Guided Activity Recognition In Video
We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, thereby providing a medium, not only for top-down and bottom-up integration, bu...
Main Authors: | Siddharth, Narayanaswamy; Barbu, Andrei; Siskind, Jeffrey Mark |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: | Center for Brains, Minds and Machines (CBMM), arXiv, 2015 |
Online Access: | http://hdl.handle.net/1721.1/100169 |
Similar Items
- Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
  by: Berzak, Yevgeni, et al.
  Published: (2016)
- Breaking mirrors - you're more than what you see
  by: Mahony, Yan An Margaret, et al.
  Published: (2018)
- TeLLMe what you see: using LLMs to explain neurons in vision models
  by: Guertler, Leon
  Published: (2024)
- Seeing What Your Programs Are Doing
  by: Lieberman, Henry
  Published: (2004)
- If you could see what I mean : descriptions of video in an anthropologist's video notebook
  by: Aguierre Smith, Thomas G
  Published: (2005)