Action Recognition by Hierarchical Sequence Summarization

Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization appro...

Bibliographic Details
Main Authors: Song, Yale, Morency, Louis-Philippe, Davis, Randall
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: 2014
Online Access: http://hdl.handle.net/1721.1/86123
https://orcid.org/0000-0001-5232-7281
author Song, Yale
Morency, Louis-Philippe
Davis, Randall
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
collection MIT
description Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization approach for action recognition that learns multiple layers of discriminative feature representations at different temporal granularities. We build up a hierarchy dynamically and recursively by alternating sequence learning and sequence summarization. For sequence learning we use CRFs with latent variables to learn hidden spatio-temporal dynamics; for sequence summarization we group observations that have similar semantic meaning in the latent space. For each layer we learn an abstract feature representation through non-linear gate functions. This procedure is repeated to obtain a hierarchical sequence summary representation. We develop an efficient learning method to train our model and show that its complexity grows sublinearly with the size of the hierarchy. Experimental results show the effectiveness of our approach, achieving the best published results on the Arm Gesture and Canal9 datasets.
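The alternation of sequence learning and sequence summarization described in the abstract above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the L1 distance, the threshold, and the mean-pooling of grouped observations are all assumptions made for the example. The latent-variable CRF is represented by a caller-supplied `infer_posteriors` function (hypothetical), and the non-linear gate functions are omitted.

```python
def group_by_latent_similarity(posteriors, threshold=0.5):
    """Group adjacent time steps whose latent posteriors are close.

    Assumption: similarity is the L1 distance between the posterior
    vectors of neighbouring time steps; the record does not specify this.
    """
    if not posteriors:
        return []
    groups = [[0]]
    for t in range(1, len(posteriors)):
        dist = sum(abs(a - b) for a, b in zip(posteriors[t], posteriors[t - 1]))
        if dist < threshold:
            groups[-1].append(t)   # similar: extend the current group
        else:
            groups.append([t])     # dissimilar: start a new group
    return groups


def summarize(features, groups):
    """Collapse each group into one observation by averaging (assumed pooling)."""
    summary = []
    for g in groups:
        dim = len(features[g[0]])
        summary.append([sum(features[t][d] for t in g) / len(g) for d in range(dim)])
    return summary


def build_hierarchy(features, infer_posteriors, num_layers=3, threshold=0.5):
    """Alternate sequence learning (stubbed) and summarization, layer by layer.

    infer_posteriors is a hypothetical stand-in for the sequence learning
    step: it maps a sequence to one posterior vector per time step.
    """
    layers = [features]
    for _ in range(num_layers - 1):
        posteriors = infer_posteriors(layers[-1])
        groups = group_by_latent_similarity(posteriors, threshold)
        layers.append(summarize(layers[-1], groups))
    return layers
```

Each layer is never longer than the one below it, which is one way to picture how the hierarchy captures coarser temporal granularities at higher levels.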
first_indexed 2024-09-23T13:55:46Z
format Article
id mit-1721.1/86123
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T13:55:46Z
publishDate 2014
record_format dspace
spelling mit-1721.1/86123
2022-10-01T18:02:55Z
Action Recognition by Hierarchical Sequence Summarization
Song, Yale
Morency, Louis-Philippe
Davis, Randall
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Song, Yale
Davis, Randall
United States. Office of Naval Research (N000140910625)
National Science Foundation (U.S.) (IIS-1018055)
United States. Army Research, Development, and Engineering Command
2014-04-11T18:42:47Z
2014-04-11T18:42:47Z
2013-06
Article
http://purl.org/eprint/type/ConferencePaper
978-0-7695-4989-7
http://hdl.handle.net/1721.1/86123
Song, Yale, Louis-Philippe Morency, and Randall Davis. “Action Recognition by Hierarchical Sequence Summarization.” 2013 IEEE Conference on Computer Vision and Pattern Recognition (n.d.).
https://orcid.org/0000-0001-5232-7281
en_US
http://dx.doi.org/10.1109/CVPR.2013.457
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
MIT web domain
title Action Recognition by Hierarchical Sequence Summarization
url http://hdl.handle.net/1721.1/86123
https://orcid.org/0000-0001-5232-7281