Seeing What You’re Told: Sentence-Guided Activity Recognition In Video
We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, thereby providing a medium, not only for top-down and bottom-up integration, bu...
Main Authors: | , , |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: | Center for Brains, Minds and Machines (CBMM), arXiv, 2015 |
Subjects: | |
Online Access: | http://hdl.handle.net/1721.1/100169 |