Deep learning for detecting multiple space-time action tubes in videos
In this work, we propose an approach to the spatiotemporal localisation (detection) and classification of multiple concurrent actions within temporally untrimmed videos. Our framework is composed of three stages. In stage 1, appearance and motion detection networks are employed to localise and score...
Main Authors: | , , , , |
---|---|
Format: | Conference item |
Published: |
British Machine Vision Conference
2016
|