Speech2Action: Cross-modal supervision for action recognition

Is it possible to guess human action from dialogue alone? In this work we investigate the link between spoken words and actions in movies. We note that movie screenplays describe actions, as well as contain the speech of characters and hence can be used to learn this correlation with no additional s...

Bibliographic Details
Main Authors: Nagrani, A; Sun, C; Ross, D; Sukthankar, R; Schmid, C; Zisserman, A
Format: Conference item
Language: English
Published: IEEE 2020