Spatio-temporal action instance segmentation and localisation

Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame. In this work we address the problem of action localisation and instance segmentation in which multiple concurrent actions of the same class may b...

Bibliographic Details
Main Authors: Saha, S, Singh, G, Sapienza, M, Torr, PHS, Cuzzolin, F
Other Authors: Nocet, N
Format: Book section
Language: English
Published: Springer 2020
author Saha, S
Singh, G
Sapienza, M
Torr, PHS
Cuzzolin, F
author2 Nocet, N
collection OXFORD
description Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame. In this work we address the problem of action localisation and instance segmentation in which multiple concurrent actions of the same class may be segmented out of an image sequence. We cast the action tube extraction as an energy maximisation problem in which configurations of region proposals in each frame are assigned a cost and the best action tubes are selected via two passes of dynamic programming. One pass associates region proposals in space and time for each action category, and another pass is used to solve for the tube’s temporal extent and to enforce a smooth label sequence through the video. In addition, by taking advantage of recent work on action foreground-background segmentation, we are able to associate each tube with class-specific segmentations. We demonstrate the performance of our algorithm on the challenging LIRIS-HARL dataset and achieve a new state-of-the-art result which is 14.3 times better than previous methods.
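The two dynamic-programming passes mentioned in the description can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the record does not state the energy terms, so the per-box class score plus IoU overlap used in the first pass, the label-switch penalty used in the second, and all names (`link_proposals`, `smooth_labels`, `overlap_weight`, `switch_penalty`) are assumptions made for this example.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def link_proposals(boxes, scores, overlap_weight=1.0):
    """Pass 1 (assumed energy): choose one proposal per frame so that the sum of
    class scores plus overlap_weight * IoU between consecutive choices is maximal."""
    best = [np.asarray(scores[0], dtype=float)]
    back = []
    for t in range(1, len(boxes)):
        # pair[j, i] = IoU between box j in frame t and box i in frame t-1
        pair = np.array([[iou(p, q) for p in boxes[t - 1]] for q in boxes[t]])
        total = best[-1][None, :] + overlap_weight * pair
        back.append(total.argmax(axis=1))
        best.append(np.asarray(scores[t], dtype=float) + total.max(axis=1))
    # Backtrack from the best final proposal to recover the whole tube.
    path = [int(best[-1].argmax())]
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return path[::-1], float(best[-1].max())

def smooth_labels(frame_scores, switch_penalty=1.0):
    """Pass 2 (assumed energy): label each frame of a tube 0 (background) or
    1 (action), maximising the summed label scores minus a penalty per switch."""
    frame_scores = np.asarray(frame_scores, dtype=float)  # shape (T, 2)
    T = frame_scores.shape[0]
    switch = switch_penalty * (1 - np.eye(2))  # switch[new, old]
    best = frame_scores[0].copy()
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        total = best[None, :] - switch
        back[t] = total.argmax(axis=1)
        best = frame_scores[t] + total.max(axis=1)
    labels = [int(best.argmax())]
    for t in range(T - 1, 0, -1):
        labels.append(int(back[t][labels[-1]]))
    return labels[::-1]
```

In this sketch the frames labelled 1 by `smooth_labels` would give the tube's temporal extent, and a larger `switch_penalty` yields a smoother label sequence.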
first_indexed 2024-03-07T07:54:45Z
format Book section
id oxford-uuid:94c3947f-1315-416c-a5b5-ac33284407bc
institution University of Oxford
language English
last_indexed 2024-03-07T07:54:45Z
publishDate 2020
publisher Springer
record_format dspace
spelling oxford-uuid:94c3947f-1315-416c-a5b5-ac33284407bc
2023-08-10T10:36:16Z
Spatio-temporal action instance segmentation and localisation
Book section
http://purl.org/coar/resource_type/c_1843
uuid:94c3947f-1315-416c-a5b5-ac33284407bc
English
Symplectic Elements
Springer
2020
Saha, S
Singh, G
Sapienza, M
Torr, PHS
Cuzzolin, F
Nocet, N
Sciutti, A
Rea, F
title Spatio-temporal action instance segmentation and localisation